Anthropic has unveiled Claude Opus 4 and Claude Sonnet 4, two new models aimed primarily at coding and complex reasoning tasks.
Claude Opus 4: The World's Best Coding Model
Anthropic describes Claude Opus 4 as the world's best coding model, citing benchmark scores of 72.5% on SWE-bench and 43.2% on Terminal-bench. The model sustains its performance on long-running tasks and can work continuously for several hours, which matters for AI agents tackling complex, multi-step projects.
Early customers have reported strong results with Claude Opus 4. Cursor calls it "state-of-the-art for coding," while Replit reports "dramatic advancements for complex changes across multiple files." Rakuten tested it on a demanding open-source refactor that ran independently for 7 hours with sustained performance.
Claude Sonnet 4: Enhanced Efficiency and Control
Claude Sonnet 4 significantly improves upon its predecessor, Sonnet 3.7, achieving 72.7% on SWE-bench while maintaining efficiency for everyday use cases. GitHub has selected it to power the new coding agent in GitHub Copilot, highlighting its excellence in autonomous scenarios.
Hybrid Reasoning and New Features
Both models operate as hybrid systems offering two modes: near-instant responses and extended thinking for deeper reasoning. Key innovations include:
- Extended thinking with tool use: Models can alternate between reasoning and tool usage (like web search) to improve responses
- Enhanced memory capabilities: When given file access, Opus 4 can create and maintain memory files for better long-term task awareness
- Parallel tool execution: Both models can use multiple tools simultaneously
- Reduced shortcut behavior: Both models are 65% less likely than Sonnet 3.7 to use shortcuts or loopholes to complete tasks
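
To make the parallel tool execution point concrete, here is a minimal client-side sketch in Python. It assumes the model's response contains several `tool_use` content blocks and that the application answers each one with a matching `tool_result` block; the tools themselves (`get_weather`, `add`) and the helper names are hypothetical stand-ins, and no API call is made. It illustrates one way a client might handle such a response and is not taken from Anthropic's implementation.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List


# Hypothetical local tools; real integrations would call external services.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"


def add(a: float, b: float) -> str:
    return str(a + b)


# Registry mapping tool names to callables (an assumption for this sketch).
TOOLS: Dict[str, Callable[..., str]] = {"get_weather": get_weather, "add": add}


def run_tool(block: Dict[str, Any]) -> Dict[str, Any]:
    """Execute one tool_use block and wrap the output as a tool_result block."""
    try:
        output = TOOLS[block["name"]](**block["input"])
        return {"type": "tool_result", "tool_use_id": block["id"], "content": output}
    except Exception as exc:
        # Report the failure back to the model instead of raising.
        return {
            "type": "tool_result",
            "tool_use_id": block["id"],
            "content": f"error: {exc}",
            "is_error": True,
        }


def run_tools_in_parallel(blocks: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Run every tool_use block concurrently; results keep the request order."""
    tool_blocks = [b for b in blocks if b.get("type") == "tool_use"]
    if not tool_blocks:
        return []
    with ThreadPoolExecutor(max_workers=len(tool_blocks)) as pool:
        return list(pool.map(run_tool, tool_blocks))
```

The returned list would typically be sent back as the content of the next user message, one `tool_result` per `tool_use` request. A thread pool suits I/O-bound tools such as web search; CPU-bound tools might need a different executor.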
Claude Code Is Now Generally Available
As part of the Claude 4 launch, Anthropic also shared that Claude Code is now generally available. Notable features include new VS Code and JetBrains integrations, GitHub Actions support, and an extensible SDK for building custom agents.
Pricing for both models is consistent with previous generations. Both are available across multiple platforms, including the Anthropic API, Amazon Bedrock, and Google Cloud Vertex AI.