ReportWire

Anthropic Says Its Latest Claude AI Is ‘the Best Coding Model in the World’


Anthropic has announced Claude Sonnet 4.5, the latest version of its default model. The company says the model isn’t just “the best coding model in the world”; it’s also “the strongest model for building complex agents.” In the context of AI, an agent is a model that uses tools to take actions, such as running code or controlling a web browser.

Anthropic said that when it comes to coding, Sonnet 4.5 is better at both identifying small improvements and considering larger changes to code, and follows instructions more directly when coding on users’ behalf. 

In data shared with Inc., Anthropic claimed that the new model exhibited state-of-the-art performance across a wide variety of benchmarks. For example, on SWE-Bench Verified, a widely used benchmark that measures an AI model’s ability to solve real-world software engineering tasks, Sonnet 4.5 solved 77.2 percent of tasks, up from the 74.5 percent solved by Claude Opus 4.1, a larger and much more expensive model released in August.

AI agents built using Sonnet 4.5 will also be a step up thanks to a new software development kit (SDK) called Claude Agent SDK. The SDK gives developers access to the same agentic tools used by the company’s popular coding agent, Claude Code. These tools enable developers to easily build Sonnet 4.5-based agents that can read and write files, manage context while working on long-running tasks, run code, search the web, pass on context from one agent to another, and coordinate multiple sub-agents to work on tasks simultaneously. 

Sonnet 4.5 is now available through the Claude API and on Claude.ai, Anthropic’s consumer-facing app for its models. The model is also available in Claude Code, which many developers access through their computer terminal.

Separately, Claude Code is getting a visual refresh and a few requested features. The most exciting update for developers will likely be the introduction of checkpoints, which will allow coders (and vibe coders) to roll their apps back to an earlier state if the model introduces a bug or unwanted feature. 

Sonnet 4.5 is also able to run uninterrupted for significantly longer than rival models. When tasked by Anthropic researchers with building an entire application, the model ran for over 30 hours without stopping or degrading in performance. In comparison, GPT-5-Codex, OpenAI’s recently released coding-optimized AI model, was found in testing to work independently for over 7 hours.

In addition to coding, Anthropic says Sonnet 4.5 has shown significant growth in its ability to help cybersecurity professionals detect, analyze, and remediate vulnerabilities, and is better at financial modeling, research, and forecasting. The model set a new record in FinanceAgent, a benchmark developed by startup Vals that judges an agent’s ability to complete tasks expected of an entry-level financial analyst.

Anthropic is also releasing a new experience for subscribers of its $100 to $200 per month Max tier. The experience, which will only last for five days, is called Imagine with Claude, and places users in a custom, Claude-generated user interface that the model can use to build software in real time. “It’s a fun demonstration showing what Claude Sonnet 4.5 can do,” Anthropic says, “a way to see what’s possible when you combine a capable model with the right infrastructure.” 

Pricing for Claude Sonnet 4.5 is unchanged from Claude Sonnet 4: $3 for every million input tokens processed by the model, and $15 for every million output tokens generated by the model.
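To make those per-token prices concrete, here is a minimal sketch of the arithmetic. The two rates come from the figures above; the function name and the sample token counts are hypothetical, chosen only for illustration, and the sketch ignores any discounts or additional charges that may apply in practice.

```python
# Rates quoted in the article: USD per one million tokens.
INPUT_USD_PER_MILLION = 3.00
OUTPUT_USD_PER_MILLION = 15.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of a request from its token counts (hypothetical helper)."""
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Illustrative workload: 200,000 input tokens and 50,000 output tokens.
    print(f"${estimate_cost_usd(200_000, 50_000):.2f}")
```

Under these assumptions, a job that reads 200,000 tokens and writes 50,000 would cost about $1.35, with more than half of that coming from the output tokens, which are priced at five times the input rate.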


Ben Sherry
