Claude 3.7 Sonnet: First AI Model To Combine Quick Answers With Extended Thinking

Wait 5 sec.

TLDRAnthropic launched Claude 3.7 Sonnet, a “hybrid AI reasoning model” that can provide both real-time answers and extended “thinking” responsesThe model outperforms competitors on coding benchmarks with 62.3% accuracy on SWE-Bench compared to OpenAI’s o3-mini (49.3%)Only premium Claude subscribers get access to the reasoning features, while free users get the standard versionAnthropic also introduced Claude Code, an agentic coding tool available as a research previewUsers report impressive results, with some claiming the model completed complex coding projects quicklyAnthropic has released Claude 3.7 Sonnet, a new AI model that combines quick responses with the ability to “think” more deeply about complex questions. The model, launched on February 24, 2025, is what Anthropic calls the industry’s first “hybrid AI reasoning model.”Claude 3.7 Sonnet can give both real-time answers and more thought-out responses based on user preference. Users can choose to activate the model’s reasoning abilities, which allow it to think for short or long periods before answering.This new approach is part of Anthropic’s effort to make AI products easier to use. Most AI chatbots today force users to pick from different models with varying costs and abilities. Anthropic wants to simplify this with a single model that handles all tasks.The model is now available to all users and developers. However, only paying customers with premium Claude plans will get access to the reasoning features. Free users will get the standard version of Claude 3.7 Sonnet without the thinking capabilities.Pricing for Claude 3.7 Sonnet is set at $3 per million input tokens and $15 per million output tokens. This makes it more expensive than OpenAI’s o3-mini ($1.10/$4.40) and DeepSeek’s R1 ($0.55/$2.19). But unlike those models, which are solely reasoning models, Claude 3.7 Sonnet combines both approaches.Reasoning ModelsAnthropic’s model breaks down from other reasoning models in the market. Other reasoning models like o3-mini, R1, Google’s Gemini 2.0 Flash Thinking, and xAI’s Grok 3 (Think) use more computing power before answering questions. They break problems down into smaller steps to improve accuracy.In the future, Anthropic plans for Claude to decide on its own how long it should think about questions. Dianne Penn, Anthropic’s product and research lead, told TechCrunch they want a smooth experience where the model integrates reasoning with other capabilities rather than keeping them separate.A key feature of Claude 3.7 Sonnet is its “visible scratch pad.” This allows users to see Claude’s thinking process for most prompts. Some portions may be hidden for safety reasons, but users can generally follow how the AI reaches its conclusions.The thinking modes were made for real-world tasks like hard coding problems. Developers using Anthropic’s API can control how much thinking time is allowed, balancing speed and cost against quality.Industry Test PerformanceClaude 3.7 Sonnet performs well on industry tests. On SWE-Bench, which tests coding tasks, it reached 62.3% accuracy compared to OpenAI’s o3-mini at 49.3%. On TAU-Bench, which tests retail interactions, it scored 81.2% versus OpenAI’s o1 model at 73.5%.The new model also refuses to answer questions less often than earlier versions. Anthropic claims they reduced unnecessary refusals by 45% compared to Claude 3.5 Sonnet. This comes as AI companies reconsider how to handle restricted content.Alongside Claude 3.7 Sonnet, Anthropic announced Claude Code. This new tool lets developers run tasks through Claude directly from their terminal. It’s currently available as a research preview to a limited number of users on a first-come, first-served basis.In demonstrations, Anthropic showed how Claude Code can analyze projects with simple commands like “Explain this project structure.” Developers can modify code using everyday language, and Claude Code describes changes as it makes them, tests for errors, and can push to GitHub.Users are reporting impressive results with the new model. On Reddit, some users claim Claude 3.7 Sonnet solved coding problems that stumped other AI models. One user wrote that it created “an entire project I had been working on for months—5000 lines of code, front-end, debugging example, all from scratch.”Benchmark tests show Claude 3.7 Sonnet leading in most categories. Its extended thinking mode improves accuracy on math and science tasks, outperforming both OpenAI and DeepSeek models.This release comes during a time of rapid development in AI. While Anthropic has typically taken a careful, safety-focused approach to releasing models, they appear to be pushing ahead of competitors with this release.The post Claude 3.7 Sonnet: First AI Model To Combine Quick Answers With Extended Thinking appeared first on Blockonomi.