Claude Sonnet 4.5
Claude Sonnet 4.5 is a coding-focused model from Anthropic with strong benchmark scores, including 77.2% on SWE-bench Verified and 61.4% on OSWorld for computer use. It sustains agentic coding sessions of more than 30 hours and delivers substantial gains across coding, reasoning, math, and domain-specific expertise.
import { streamText } from 'ai'

const result = streamText({
  model: 'anthropic/claude-sonnet-4.5',
  prompt: 'Why is the sky blue?',
})

Frequently Asked Questions
What was Claude Sonnet 4.5's OSWorld score and why does it matter?
61.4%, up from Sonnet 4's 42.2% four months earlier. OSWorld measures AI performance on real-world computer tasks: navigating software, filling forms, and clicking UI elements. It focuses on operational computer-use scenarios rather than abstract reasoning alone.
How long can Claude Sonnet 4.5 maintain focus on a single agentic coding task?
More than 30 hours on complex, multi-step tasks. Anthropic noted this duration changes what's architecturally feasible for autonomous engineering work. Individual results vary by task structure.
What is ASL-3 and why does it apply to Sonnet 4.5?
ASL-3 (AI Safety Level 3) is Anthropic's framework level for models requiring additional safeguards. Sonnet 4.5 is the first Claude model released under ASL-3 protections, which include classifiers screening inputs and outputs for CBRN-related content. False positive rates have decreased by a factor of 10 since initial deployment.
What is the Claude Agent SDK and how does it relate to this model?
The Claude Agent SDK launched alongside Sonnet 4.5. It gives you access to the same agent infrastructure that powers Claude Code: memory management across long tasks, permission systems, and subagent coordination. Use it to build custom agents on the same foundation.
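A minimal sketch of what building on the SDK can look like, assuming the Claude Agent SDK's documented `query` API; the prompt, tool names, and option values here are illustrative, not from this page:

```typescript
import { query } from '@anthropic-ai/claude-agent-sdk'

// query() runs an agentic loop (the same infrastructure behind Claude Code)
// and yields messages as the agent works.
for await (const message of query({
  prompt: 'Find and fix the failing unit test in this repository',
  options: {
    allowedTools: ['Read', 'Edit', 'Bash'], // permission system: restrict what the agent may do
    maxTurns: 20, // bound the autonomous loop
  },
})) {
  if (message.type === 'result') {
    console.log(message) // final result message when the task completes
  }
}
```

Running this requires the SDK installed and an Anthropic API key in the environment; the permission options are where the SDK's guardrails for long-running agents come in.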
What alignment improvements came with Sonnet 4.5?
Substantial reductions in sycophancy, deception, power-seeking, encouragement of delusional thinking, and compliance with harmful system prompts, as measured by an automated behavioral auditor. The model also shows stronger defenses against prompt injection attacks in computer-use and agentic settings.
Why did specialists in finance, law, medicine, and STEM find Sonnet 4.5 significantly better than previous models?
In Anthropic's expert evaluations, professionals assessed domain-specific knowledge and reasoning and rated Sonnet 4.5 substantially better than older models, including Opus 4.1. The intelligence improvements extend beyond coding benchmarks.
Is Sonnet 4.5 priced differently from Sonnet 4?
Current pricing is shown on this page. AI Gateway routes requests across multiple providers, and rates may vary by provider.