OpenAI expands GPT-4.5 rollout. Here's how to access (and what it can do for you)


Last week, OpenAI launched GPT-4.5, which the company claims is its “largest and most knowledgeable model yet.” It debuted as a research preview available only to subscribers of ChatGPT Pro, the $200-per-month plan. As of today, however, more ChatGPT users can access it for much less money.

Expanded GPT-4.5 access

On Wednesday morning, OpenAI announced via an X post that it had begun rolling out GPT-4.5 to ChatGPT Plus users. The company initially said the full rollout could take one to three hours; however, just an hour later, the rollout was complete, faster than expected, according to the X post.

The model’s usage limits for ChatGPT Plus users aren’t yet clear. OpenAI said it plans to give everyone a “sizable rate limit,” but the rates will change as the company learns more about demand for the model. ChatGPT Pro subscribers continue to have access to GPT-4.5, but if you want to try it for less, you can do so with the ChatGPT Plus plan, which costs $20 per month.

What is GPT-4.5?

At launch, OpenAI said users should see an overall improvement with GPT-4.5: fewer hallucinations, stronger alignment with prompt intent, and better emotional intelligence. Interactions with the model should feel more intuitive and natural than with preceding models, mostly because of its deeper knowledge and improved contextual understanding.

Also: OpenAI’s reasoning models just got two useful updates

The two methods driving the model’s improvements are unsupervised learning, which increases word knowledge and intuition, and reasoning. Even though GPT-4.5 does not offer chain-of-thought reasoning, as OpenAI’s o1 reasoning model does, it still provides a higher level of reasoning with less lag, along with other improvements such as awareness of social cues.

For example, in the demo, ChatGPT was asked to write a text conveying a message of hate while running both GPT-4.5 and o1. The o1 version took a bit longer and produced only one response, which treated the hateful message very seriously and sounded a bit harsh. GPT-4.5 offered two responses, one lighter and one more serious; neither explicitly mentioned hate, instead expressing disappointment in how the “user” was choosing to behave.

Screenshot by Sabrina Ortiz/ZDNET

Similarly, when both models were asked to explain a technical topic, GPT-4.5’s answer flowed more naturally than o1’s more structured output. Ultimately, GPT-4.5 is meant for everyday tasks across a range of topics, including writing and solving practical problems.

Also: How to use OpenAI’s Sora to create stunning AI-generated videos

To achieve these improvements, the model was trained using new supervision techniques and traditional ones, such as supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF). 

During the livestream, OpenAI took a trip down memory lane, asking all of its past models, starting with GPT-1, to answer the question, “Why is water salty?” As expected, every subsequent model gave a better answer than the last. The distinguishing factor for GPT-4.5 was what OpenAI called its “great personality,” which made the response lighter, more conversational, and more engaging to read, thanks in part to its use of alliteration.

The model integrates with some of ChatGPT’s most advanced features, including Search, Canvas, and file and image uploads. However, it will not be available in multimodal features like Voice Mode, video, and screen sharing. OpenAI has said that, in the future, it plans to make switching between models a more seamless experience that doesn’t rely on the model picker.

Benchmarks

Of course, it wouldn’t be a model release without a dive into benchmarks. Across some of the major benchmarks used to evaluate these models, including Competition Math (AIME 2024), PhD-level Science Questions (GPQA Diamond), and SWE-Bench Verified (coding), GPT-4.5 outperformed GPT-4o, its preceding general-purpose model.

GPT-4.5 Benchmarks

OpenAI

Also: Want your Safari to default to ChatGPT for search? Here’s how to do it

Most notably, when compared to OpenAI o3-mini, the company’s recently launched reasoning model that was taught to think before it speaks, GPT-4.5 came much closer to the reasoning model’s scores than GPT-4o did, even surpassing o3-mini on the SWE-Lancer Diamond (coding) and MMMLU (multilingual) benchmarks.

A big concern with generative AI models is their tendency to hallucinate, or include incorrect information in their responses. Two hallucination evaluations, SimpleQA Accuracy and SimpleQA Hallucination Rate, showed that GPT-4.5 was more accurate and hallucinated less than GPT-4o, o1, and o3-mini.

SimpleQA Accuracy and SimpleQA Hallucination Rate

OpenAI

In comparative evaluations, human testers preferred GPT-4.5 over GPT-4o for everyday, professional, and creative queries.

Security

As always, OpenAI reassured the public that the model was deemed safe enough to release, stress-testing it and detailing the results in the accompanying system card. The company also noted that every new release and increase in model capabilities brings opportunities to make its models safer; for that reason, with the GPT-4.5 release, it combined new supervision techniques with RLHF.




