Anthropic's latest Claude AI models are here – and you can try one for free today


Since its founding in 2021, Anthropic has quickly become one of the leading AI companies and, with its Claude models, a worthy competitor to OpenAI, Google, and Microsoft. Building on this momentum, the company held its first developer conference, Code with Claude, on Thursday, showcasing what it has done so far and where it is going next.

(Disclosure: Ziff Davis, ZDNET’s parent company, filed an April 2025 lawsuit against OpenAI, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.)

Anthropic used the event stage to unveil two highly anticipated models, Claude Opus 4 and Claude Sonnet 4. Both offer improvements over their preceding models, including better performance in coding and reasoning. Beyond that, the company launched new features and tools for its models that should improve the user experience.

Keep reading to learn more about the new models.

Claude Opus 4

The Claude Opus family has always comprised the company’s most advanced, intelligent AI models, geared toward complex tasks. Claude Opus 3 was already renowned as a highly capable model, and the newest generation is even more so. Anthropic called Opus 4 its most powerful model yet and the best coding model in the world, a claim it supports with SWE-bench results, which you can find below.

Anthropic said Opus 4 was built to deliver sustained performance on complex, long-running tasks that require thousands of steps, significantly outperforming all of the Sonnet models. One of the biggest highlights is that the model can run autonomously for several hours, making Claude Opus 4 a great model for powering AI agents — the next frontier of AI assistance.

The appeal of AI agents lies in their ability to perform tasks for people without intervention. To do so successfully, they need to reason through the next necessary steps, such as which tool to call on or what action to take. As a result, agents need a model that can reason well and sustain that reasoning over time — like Claude Opus 4.

Claude Sonnet 4

[Image: Claude 4 model selector. Credit: Anthropic]

As the next generation of the Claude Sonnet family, Claude Sonnet 4 maintains the appeal of its predecessor: a highly capable yet practical model fit for most people’s needs. Claude Sonnet 4 builds on Claude Sonnet 3.7 with improved reasoning, coding, and steerability, a term that describes how well a model follows human direction. It is now a drop-in replacement for Claude Sonnet 3.7 in the chatbot.

Other improvements to Claude

A new feature available in beta allows Opus 4 and Sonnet 4 to alternate between extended thinking and tool use, which is meant to combine speed with accuracy. Anthropic said Claude can also use tools in parallel, meaning it can issue several tool calls at once and run them sequentially or simultaneously, whichever suits the task at hand.
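To make the difference between sequential and simultaneous tool calls concrete, here is a minimal Python sketch. It does not use Anthropic’s API; the two "tools" (`search_docs` and `read_file`) are hypothetical stand-ins that simply sleep to simulate latency, and the helper names are assumptions made for illustration.

```python
import asyncio
import time


# Hypothetical stand-in tools: each sleeps briefly to simulate latency.
async def search_docs(query: str) -> str:
    await asyncio.sleep(0.2)
    return f"docs for {query}"


async def read_file(path: str) -> str:
    await asyncio.sleep(0.2)
    return f"contents of {path}"


async def run_sequentially() -> list[str]:
    # Each call waits for the previous one to finish.
    first = await search_docs("claude")
    second = await read_file("notes.txt")
    return [first, second]


async def run_in_parallel() -> list[str]:
    # asyncio.gather runs both coroutines concurrently and returns
    # results in the same order as its arguments.
    return list(await asyncio.gather(search_docs("claude"), read_file("notes.txt")))


def timed(coro_fn):
    """Run an async function to completion and return (result, seconds elapsed)."""
    start = time.perf_counter()
    result = asyncio.run(coro_fn())
    return result, time.perf_counter() - start


if __name__ == "__main__":
    for fn in (run_sequentially, run_in_parallel):
        result, elapsed = timed(fn)
        print(f"{fn.__name__}: {result} in {elapsed:.2f}s")
```

Both approaches return the same results; the concurrent version finishes sooner because the two simulated waits overlap rather than add up.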

When developers give Claude access to local files, it can now create and maintain “memory files” with the key insights, which allows for “better long-term task awareness, coherence, and performance on agent tasks,” according to Anthropic. Developers also get new capabilities in the Anthropic API for building more powerful agents, including the code execution tool, MCP connector, Files API, and prompt caching supported for up to one hour. 
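The "memory files" idea can be pictured with a small Python sketch. Anthropic’s description here does not specify a file format or interface, so everything below is an assumption made for illustration: the JSON layout, the `load_memory` and `remember` helper names, and the example keys are all hypothetical.

```python
import json
from pathlib import Path


def load_memory(memory_path: Path) -> dict[str, str]:
    """Return previously saved insights, or an empty dict if none exist yet."""
    if not memory_path.exists():
        return {}
    return json.loads(memory_path.read_text(encoding="utf-8"))


def remember(memory_path: Path, key: str, insight: str) -> None:
    """Add or update one insight, then write the whole memory file back to disk."""
    memory = load_memory(memory_path)
    memory[key] = insight
    memory_path.write_text(json.dumps(memory, indent=2), encoding="utf-8")


if __name__ == "__main__":
    path = Path("memory.json")  # hypothetical location chosen for this example
    remember(path, "build", "run the test suite before committing")
    print(load_memory(path))
```

The point of the sketch is only that insights persist in a local file between steps, so a long-running agent can reread them later instead of rediscovering them.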

Another improvement in both models is a 65% reduction in reward hacking — a behavior where the model takes shortcuts to complete a task — compared to Claude Sonnet 3.7, particularly on agentic coding tasks where this issue is common.

Users will also gain enhanced insight into the model’s thinking process with a new thinking summaries feature. This feature displays the model’s reasoning in digestible insights rather than a raw chain of thought when the thought processes are too lengthy. 

Anthropic said that summarization will be needed only about 5% of the time, as most thought processes are short enough to display in full. Having insight into how the model arrived at a conclusion helps users verify its accuracy, identify any gaps in the process, and perhaps learn how they could have arrived at the answer themselves.

Anthropic also announced plans for the company’s future, including making the models ready for higher AI safety levels such as ASL-3 and providing more frequent model updates so that customers can access breakthrough capabilities faster.

Benchmarks 

As with any model release, the launch of Opus 4 and Sonnet 4 was accompanied by benchmark results. Both models demonstrated exceptional performance in coding tasks. On SWE-bench Verified, a benchmark that evaluates large language models on real-world software challenges requiring agentic reasoning and multi-step code generation, Opus 4 and Sonnet 4 outperformed several leading models in the coding domain, including OpenAI Codex-1, OpenAI o3, GPT-4.1, and Gemini 2.5 Pro.

[Image: SWE-bench Verified results. Credit: Anthropic]

Beyond coding, Opus 4 and Sonnet 4 also performed competitively, either leading their categories or coming close, across other widely used benchmarks, including GPQA Diamond, which tests graduate-level reasoning; AIME 2025, which tests high school competition-level math; and MMMLU, which tests multilingual tasks.

[Image: Claude 4 benchmark results. Credit: Anthropic]

Availability

Claude Opus 4 and Sonnet 4 are hybrid models with a near-instant response mode and an extended reasoning mode for requests that require deeper analysis. Paid Claude plans, including Pro, Max, Team, and Enterprise, have access to both models and extended thinking. Claude Sonnet 4 is also available to free users.

Developers can access both models on the Anthropic API, Amazon Bedrock, and Google Cloud’s Vertex AI. Anthropic says pricing is consistent with previous models.

Bonus: Claude Code

Claude Code lets developers use Claude’s coding assistant directly where they write and manage code, whether that’s in the terminal, inside their IDE, or running in the background with the Claude Code SDK. For example, new beta extensions for VS Code and JetBrains allow users to integrate Claude Code within those IDEs, where Claude’s proposed edits will appear inline.

Anthropic also announced the launch of a Claude Code SDK, which allows users to build their own AI-powered tools and agents on the same “core agent” as Claude Code, so they get the same level of assistance. As an example, Anthropic shared the beta launch of Claude Code on GitHub, which allows users to call on Claude Code in PRs (pull requests) for help with fixing errors, responding to reviewer feedback, and more.
