OpenAI's newly released GPT-4o mini dominates the Chatbot Arena. Here's why.
One week ago, OpenAI released GPT-4o mini. In that short time, it has already been updated and climbed the leaderboards of the Large Model Systems Organization (LMSYS) Chatbot Arena, ahead of giants such as Claude 3.5 Sonnet and Gemini Advanced.
The LMSYS Chatbot Arena is a crowdsourced platform where users can evaluate large language models (LLMs) by chatting with two LLMs side by side and comparing their responses to each other without knowing the models’ names.
Immediately after its unveiling, GPT-4o mini was added to the Arena, where it quickly climbed to near the top of the leaderboard, behind only GPT-4o. This is especially notable because GPT-4o mini is 20 times cheaper than its predecessor.
As the results came out, some users took to social media to question how such a new mini model could rank higher than more established, robust, and capable models such as Claude 3.5 Sonnet. To address the concerns, LMSYS, posting on X, explained the factors contributing to GPT-4o mini's high placement, highlighting that Chatbot Arena rankings reflect human preferences as expressed through user votes.
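To make the idea of vote-driven rankings concrete, here is a minimal sketch of how blind head-to-head votes could be turned into a leaderboard. It assumes a simple Elo-style update, which is an illustrative choice and not necessarily the exact method LMSYS uses; the model names, the starting rating, and the `k` factor are all hypothetical.

```python
from collections import defaultdict


def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def rate(votes, k=32.0, start=1000.0):
    """Fold a sequence of (model_a, model_b, winner) votes into ratings.

    winner is "a", "b", or "tie". The k factor and starting rating are
    illustrative assumptions, not values taken from the Chatbot Arena.
    """
    ratings = defaultdict(lambda: start)
    outcome = {"a": 1.0, "b": 0.0, "tie": 0.5}
    for model_a, model_b, winner in votes:
        ra, rb = ratings[model_a], ratings[model_b]
        ea = expected_score(ra, rb)
        sa = outcome[winner]
        # The winner gains what the loser gives up, so total rating is conserved.
        ratings[model_a] = ra + k * (sa - ea)
        ratings[model_b] = rb + k * ((1.0 - sa) - (1.0 - ea))
    return dict(ratings)


def leaderboard(ratings):
    """Model names sorted from highest to lowest rating."""
    return sorted(ratings, key=ratings.get, reverse=True)


if __name__ == "__main__":
    # Hypothetical votes: each tuple is one blind side-by-side comparison.
    sample_votes = [
        ("model-x", "model-y", "a"),
        ("model-x", "model-z", "a"),
        ("model-y", "model-z", "tie"),
    ]
    print(leaderboard(rate(sample_votes)))
```

The point of the sketch is that a rating of this kind measures which answers voters preferred, not performance on a fixed technical benchmark, so a smaller model can rank highly if people like its responses.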
LMSYS encourages users interested in learning which model works better to look at the per-category breakdowns to understand technical capabilities. These can be accessed by clicking the Category dropdown that says "Overall" and selecting a different category. When you visit the various category breakdowns, such as coding, hard prompts, and longer queries, you will see a variation in the results.
In the coding category, GPT-4o mini is ranked third, behind GPT-4o and Claude 3.5 Sonnet, which holds first place. However, GPT-4o mini is number one in other categories, such as multi-turn (conversations of two or more turns) and longer query (queries of 500 tokens or more).
If you want to try GPT-4o mini, visit the ChatGPT site and log in to your OpenAI account. If you would rather participate in the Chatbot Arena and see whether chance serves you GPT-4o mini, you can start by visiting the website, clicking Arena side-by-side, and then entering a sample prompt.