OpenAI reveals an updated GPT-4o model – but can't quite explain how it's better
There’s a new version of OpenAI’s GPT-4o model in town. But precisely what it can do seems to be a mystery, even to OpenAI. In an X post on Monday, the company spilled the beans, saying: “there’s a new GPT-4o model out in ChatGPT since last week. Hope you all are enjoying it and check it out if you haven’t! we think you’ll like it.”
Otherwise, OpenAI was mum about what improvements this new model offers. In updates to its X post, the company said that the new GPT-4o model is available to paid subscribers as well as those on the free tier (with a message cap). But it’s not the same model as GPT-4o-2024-08-06, which was also released last week and is now running on Microsoft Azure.
Some ChatGPT users chimed in before Monday’s announcement, claiming they noticed a difference in the chatbot’s handling of requests and tasks. According to VentureBeat, several people felt that GPT-4o was behaving differently and better than in the past. Others said that GPT-4o’s native image generation skills through ChatGPT seemed to be kicking in. A few said that the upgrade improved multi-step reasoning.
In one X post, an account named @misaligned_agi said, “Wow, GPT-4o now uses multi-step reasoning. It’s impressive to see this in action. Turns out the update wasn’t a new model but a new method.”
With multi-step reasoning, an AI breaks down complex problems and questions into a smaller series of sequential steps, tackling each step individually, and then comes up with the response. The best example is a math problem that requires several calculations. The AI solves each equation to arrive at the overall answer.
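To make the idea concrete, here is a minimal sketch of a multi-step calculation written out by hand. It only illustrates what "breaking a problem into sequential steps" looks like and says nothing about how GPT-4o works internally. The function name `average_speed` and the trip figures are hypothetical, chosen purely for the example.

```python
# Illustrative only: a hand-written decomposition of a math problem into
# sequential steps. This is NOT how GPT-4o is implemented.

def average_speed(legs):
    """legs is a list of (speed_kmh, hours) tuples; returns (steps, answer)."""
    steps = []
    total_distance = 0
    total_time = 0
    # Solve each leg of the trip as its own small calculation.
    for i, (speed, hours) in enumerate(legs, start=1):
        distance = speed * hours
        steps.append(f"Step {i}: {speed} km/h x {hours} h = {distance} km")
        total_distance += distance
        total_time += hours
    # Combine the intermediate results into the overall answer.
    answer = total_distance / total_time
    steps.append(
        f"Step {len(legs) + 1}: {total_distance} km / {total_time} h = {answer} km/h"
    )
    return steps, answer


# Hypothetical problem: 2 hours at 60 km/h, then 3 hours at 80 km/h.
steps, answer = average_speed([(60, 2), (80, 3)])
for line in steps:
    print(line)
```

Each intermediate result feeds the next step, which is the pattern the term describes.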
However, a spokesperson for OpenAI told me that the speculation about multi-step reasoning missed the mark.
After much theorizing among ChatGPT users, OpenAI finally shed some light on the update, now known as ChatGPT-4o-latest. The only catch is that the company’s explanation is still vague.
“Bug fixes and performance improvements … we’ve introduced an update to GPT-4o that we’ve found, through experiment results and qualitative feedback, ChatGPT users tend to prefer,” OpenAI said in its latest release notes on Tuesday. “It’s not a new frontier-class model. Although we’d like to tell you exactly how the model responses are different, figuring out how to granularly benchmark and communicate model behavior improvements is an ongoing area of research in itself (which we’re working on!).”
This suggests that OpenAI conjured up a new and improved model but doesn’t really know how or why it’s better. Hmm, OK. Further details in the release notes still didn’t answer the question.
“Sometimes we can point to new capabilities and specific improvements — and we’ll try our best to communicate that whenever possible,” OpenAI added in its notes. “In the meantime, our team is constantly iterating on the model by adding good data, removing bad data, and experimenting with new research methods based on user feedback, offline evaluations, and more. That’s the case with this model update.”
Here, it sounds like OpenAI is waiting for users to define the new model so that everyone can figure out what it actually does. In other words, OpenAI says to its users, “You tell me, and then we’ll both know.”
On its ChatGPT models page, the company provided a few specifics on ChatGPT-4o-latest. Described as a dynamic model continuously updated to the current version of GPT-4o, it’s intended for research and evaluation.
Trained on data up to October 2023, this latest model can handle 128,000 tokens, or roughly 96,000 words, in a single conversation, the same amount as its predecessors. However, it can output up to 16,384 tokens, or roughly 12,288 words, matching GPT-4o-mini and improving on the 4,096-token limit of the original GPT-4o model.
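The word counts above follow from a simple conversion. The sketch below reproduces the arithmetic, assuming a ratio of 0.75 words per token. That ratio is implied by the article's own figures and is only a rule of thumb, since actual counts vary with the text and the tokenizer. The names `WORDS_PER_TOKEN` and `tokens_to_words` are hypothetical.

```python
# Assumption: 0.75 words per token, the ratio implied by the figures quoted
# above. Real token-to-word ratios vary by text and tokenizer.
WORDS_PER_TOKEN = 0.75


def tokens_to_words(tokens):
    """Convert a token count to an approximate word count."""
    return round(tokens * WORDS_PER_TOKEN)


context_window = 128_000  # tokens per conversation
new_output_cap = 16_384   # output tokens, ChatGPT-4o-latest
old_output_cap = 4_096    # output tokens, original GPT-4o

print(tokens_to_words(context_window))   # 96000
print(tokens_to_words(new_output_cap))   # 12288
print(new_output_cap // old_output_cap)  # 4
```

On these numbers, the new output limit is four times the original one.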
Whatever new model or method OpenAI has added to GPT-4o, the results certainly seem worth the effort. The latest version landed at the top of the pack in testing at Chatbot Arena, a site that pits one AI chatbot model against another.
Listed under “anonymous-chatbot,” ChatGPT-4o-latest earned a score of 1315 based on more than 11,000 community votes, helping OpenAI reclaim the top spot from Google’s Gemini 1.5. Based on its performance, the new model showed a notable improvement in such technical domains as coding, following instructions, and hard prompts.
If you want to see for yourself, taking ChatGPT-4o-latest for a spin is simple enough. The new skills are already baked into the version of GPT-4o available on the ChatGPT website and in the mobile apps (as well as the API). ChatGPT Plus subscribers should make sure the model is set to GPT-4o, while free users can use the standard ChatGPT.
Try asking more complex and nuanced questions and see how the AI fares, especially compared with its past performance. Then, maybe together, we’ll figure out what this new model actually does.