OpenAI finally unveils GPT-4.5. Here's what it can do

Earlier this month, OpenAI CEO Sam Altman shared a roadmap for the company's upcoming models, GPT-4.5 and GPT-5. In the X post, Altman said that GPT-4.5, codenamed Orion internally, would be OpenAI's last non-chain-of-thought model. Beyond that, the details of the model remained a mystery until today.
GPT-4.5 has launched
On Thursday morning, OpenAI cryptically announced it would host a livestream in 4.5 hours, a hint at its latest and greatest model. During the livestream, OpenAI unveiled GPT-4.5 as a research preview, calling it the company's "largest and most knowledgeable model yet."
OpenAI said users should see an overall improvement when using GPT-4.5: fewer hallucinations, stronger alignment with the intent of their prompts, and better emotional intelligence. Interactions with the model should feel more intuitive and natural than with preceding models, largely because of its deeper knowledge and improved contextual understanding.
Also: OpenAI’s reasoning models just got two useful updates
Unsupervised learning, which increases word knowledge and intuition, and reasoning were the two methods driving the model's improvements. Even though GPT-4.5 does not offer chain-of-thought reasoning, as OpenAI's o1 reasoning model does, it should still deliver a higher level of reasoning with less lag, along with other improvements such as awareness of social cues.
For example, in the demo, ChatGPT was asked to write a text conveying a message of hate while running GPT-4.5 and o1. The o1 version took a bit longer and produced only one response, which took the request very literally and sounded harsh. GPT-4.5 offered two different responses, one lighter and one more serious. Neither explicitly mentioned hate; instead, both expressed disappointment in how the "user" was choosing to behave.
Similarly, when both models were asked to provide information on a technical topic, GPT-4.5 provided an answer that flowed more naturally, compared to the more structured output of o1. Ultimately, GPT-4.5 is meant for everyday tasks across a variety of topics, including writing and solving practical problems.
Also: How to use OpenAI’s Sora to create stunning AI-generated videos
To achieve these improvements, the model was trained using new supervision techniques as well as traditional ones, such as supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF).
During the livestream, OpenAI took a trip down memory lane, asking all of its past models, starting with GPT-1, to answer the question, “Why is water salty?” As expected, every subsequent model gave a better answer than the last. The distinguishing factor for GPT-4.5 was what OpenAI called its “great personality,” which made the response lighter, more conversational, and more engaging to read by using techniques like alliteration.
The model integrates with some of ChatGPT's most advanced features, including Search, Canvas, and file and image upload. It will not be available in multimodal features like Voice Mode, video, and screen sharing. In the future, OpenAI says it plans to make switching between models more seamless, so users won't have to rely on the model picker.
Benchmarks
Of course, it wouldn't be a model release without a dive into benchmarks. Across some of the major benchmarks used to evaluate these models, including Competition Math (AIME 2024), PhD-level Science Questions (GPQA Diamond), and SWE-Bench Verified (coding), GPT-4.5 outperformed GPT-4o, its preceding general-purpose model.
Also: Want your Safari to default to ChatGPT for search? Here’s how to do it
Most notably, when compared to o3-mini, OpenAI's recently launched reasoning model that was taught to think before it speaks, GPT-4.5 came much closer to matching its scores than GPT-4o did, even surpassing o3-mini on the SWE-Lancer Diamond (coding) and MMMLU (multilingual) benchmarks.
A big concern with generative AI models is their tendency to hallucinate, or include incorrect information in responses. Two hallucination evaluations, SimpleQA Accuracy and SimpleQA Hallucination, showed that GPT-4.5 was more accurate and hallucinated less than GPT-4o, o1, and o3-mini.
In comparative evaluations, human testers preferred GPT-4.5 over GPT-4o across everyday, professional, and creative queries.
Safety
As always, OpenAI reassured the public that the model was deemed safe enough to release, stress-testing it and detailing the results in the accompanying system card. The company added that every new release and increase in model capabilities brings opportunities to make its models safer. For that reason, with the GPT-4.5 release, the company combined new supervision techniques with RLHF.
Availability
GPT-4.5 is in research preview for Pro users for now, accessible via the model picker on web, mobile, and desktop. If you don't want to shell out $200 a month for a Pro subscription, OpenAI said it will begin rolling out GPT-4.5 to Plus and Team users next week, and then to Enterprise and Edu users the week after.
Also: OpenAI’s Deep Research can save you hours of work – and now it’s a lot cheaper to access
Altman shared on X that the goal was to launch the model for both Pro and Plus users at the same time, but that it is a "giant, expensive model." He added that the company has run out of GPUs, and that it will add tens of thousands of GPUs next week before rolling the model out to Plus users.
The model is also being previewed to developers on all paid usage tiers in the Chat Completions API, Assistants API, and Batch API, according to OpenAI.
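For developers, calling the preview through the Chat Completions API looks like any other chat request; only the model name changes. Below is a minimal sketch using the official OpenAI Python SDK. The model identifier "gpt-4.5-preview" is an assumption for illustration; check OpenAI's model listing for the exact name available on your tier.

```python
# Minimal sketch: querying the GPT-4.5 research preview via the Chat Completions API.
# Requires the official OpenAI Python SDK (pip install openai) and an OPENAI_API_KEY
# environment variable. The model name below is assumed, not confirmed by the article.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4.5-preview",  # assumed preview identifier; verify against OpenAI's model list
    messages=[
        {"role": "system", "content": "You are a concise, helpful assistant."},
        {"role": "user", "content": "Why is water salty?"},  # the question from OpenAI's demo
    ],
)

print(response.choices[0].message.content)
```

Because the request shape is identical to other chat models, swapping an existing GPT-4o integration over to the preview should only require changing the model parameter, subject to the preview's pricing and rate limits.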