xAI's Grok 3 is better than expected. How to try it for free (before you subscribe)


Elon Musk was an investor in OpenAI when it was founded in 2015. Since then, he’s completely severed his ties with the startup, alleging the company has departed from its original non-profit mission. He created his own AI company, xAI, and with it, a large language model (LLM) called Grok. Now, the company has launched a new model, Grok 3, which is soaring to the top of the chatbot leaderboards. 

Grok 3

On Monday, Elon Musk launched xAI’s latest family of AI models, Grok 3, via a live stream. Grok 3 boasts 10 times more training than Grok 2, made possible by xAI’s creation of its own Memphis, Tenn.-based data center, home to 200,000 GPUs. 

“We are excited to present Grok 3, which we think is an order of magnitude more capable than Grok 2,” said Musk during the live stream.

The family of models also includes a reasoning model, which builds on Grok 3. Like other reasoning models on the market, including OpenAI’s o1 and o3 models, the Grok 3 Reasoning beta thinks for a bit longer to output higher-quality results. 

All Grok 3 models are meant to compete with leading models. Grok 3 competes with OpenAI’s GPT-4o and Google’s Gemini, and Grok 3 Reasoning competes with o3-mini (high), o1, and DeepSeek-R1. With less than 24 hours on the market, xAI’s offerings are dominating benchmarks and leaderboards.

Performance 

The model’s pre-training ended in early January, and even though it is still undergoing training, Grok 3 has outperformed leading models on AI benchmarks, including the AIME ’24, which tests for mathematical reasoning; GPQA, which tests for proficiency in science, specifically biology, physics, and chemistry; and the LCB Oct-Feb, which tests for coding capabilities. 

[Image: Grok 3 benchmarks. Credit: Grok]

The Grok 3 reasoning model and Grok 3 mini reasoning model are still being developed, but according to results shared by xAI during the live stream, the betas of both models performed competitively against o3-mini (high), o1, DeepSeek-R1, and Gemini 2.0 Flash Thinking across the AIME, GPQA, and LCB benchmarks.

[Image: Reasoning model Grok 3. Screenshot by Sabrina Ortiz/ZDNET]

Beyond technical benchmarks, Grok 3 climbed the charts on the Chatbot Arena, a crowdsourced platform where users can evaluate LLMs by chatting with two LLMs side by side and comparing their responses to each other without knowing the models’ names. 

Before the official launch of Grok 3, an early version of the model ran in the Arena under the title “chocolate,” and it placed first above Gemini, GPT-4o, DeepSeek-R1, and more across all categories. It also became the first model to break a score of 1400 in the Arena.
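To make the idea of a leaderboard score built from blind head-to-head votes more concrete, here is a minimal sketch of an Elo-style rating update in Python. This is an illustration only: the article does not describe how the Arena actually computes its scores, and the function names, the K-factor of 32, and the starting ratings below are all assumptions chosen for the example.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that model A beats model B under the standard Elo formula."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(rating_a: float, rating_b: float,
                   outcome_a: float, k: float = 32.0) -> tuple[float, float]:
    """Return new (rating_a, rating_b) after one blind comparison.

    outcome_a is 1.0 if the voter preferred A, 0.0 if they preferred B,
    and 0.5 for a tie. k (assumed value) controls how far one vote moves a rating.
    """
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (outcome_a - exp_a)
    # B's update mirrors A's, so the total rating points are conserved.
    new_b = rating_b + k * ((1.0 - outcome_a) - (1.0 - exp_a))
    return new_a, new_b


if __name__ == "__main__":
    # Two hypothetical models start level; a voter prefers the first one.
    print(update_ratings(1400.0, 1400.0, outcome_a=1.0))
```

In a scheme like this, a model only climbs by repeatedly winning comparisons against strong opponents, which is why a high score on a crowdsourced leaderboard reflects many individual user votes rather than a single test.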

DeepSearch

To meet the demand for agentic capabilities, xAI also launched DeepSearch, which is similar to OpenAI’s and Google’s deep research features. With DeepSearch, users can ask a question, and Grok will think it through, search the web, output its thinking process as it goes, and then generate a final, robust response with data and tables as necessary. This means you can ask it to research a topic, come back 10 minutes later, and the task will be completed. 


One of the biggest standouts is being able to scroll through Grok’s thoughts (“reading through the mind of Grok”) and understand how it landed on its final response. This makes the experience more steerable and helps you better understand your results.

How to access

[Image: Grok-3 on X Premium+. Screenshot by Sabrina Ortiz/ZDNET]

Starting today, you can access some of the Grok models in beta. Grok 3 is available on X Premium+, which also grants users access to the latest features, an increased usage limit, DeepSearch, and advanced reasoning modes, which are enabled by clicking the “Think” or “Big Brain” options.

The X Premium+ subscription costs $40 per month, up from $22 before the announcement was made, as spotted by TechCrunch. Subscribers should update the app to see the new features.


xAI also unveiled a new subscription tier, SuperGrok, akin to ChatGPT Pro, meant for super fans who want the earliest access to the most advanced capabilities. This plan’s price is yet to be shared, but you can expect it to cost a pretty penny, as OpenAI’s Pro subscription costs $200 per month.

For the most polished version, Musk encourages users to wait a week. By then, a new voice integration will likely be ready to deploy. If you’d rather participate in the Chatbot Arena and let luck show you Grok 3, visit the website, click Arena side-by-side, and then enter a sample prompt. Even though the arena still has an early version of Grok 3, it’s still a powerful model; after all, it reached the top of the leaderboard compared to the other models, which are in their latest versions. 




