Claude Opus 4.7 Was Anthropic's Best Model

Anthropic's Claude Opus 4.7 launched on April 16, 2026, to some of the strongest benchmark numbers the company had ever shipped. Six weeks later, it was already obsolete.

On May 28, Anthropic released Claude Opus 4.8, making Opus 4.7 the shortest-lived flagship model in the company's history — dethroned just 42 days after it first went live. That's the fastest Opus-to-Opus turnover Anthropic has ever shipped, and it says something about how quickly the frontier is moving right now.

What Opus 4.7 Actually Delivered

The April release wasn't just a minor tick-up. Opus 4.7 posted real gains: SWE-bench Verified jumped from 80.8% to 87.6%, CursorBench climbed from 58% to 70%, and image resolution support tripled compared to Opus 4.6. On Anthropic's internal 93-task coding suite, it solved 13% more problems than its predecessor — including four that neither Opus 4.6 nor Sonnet 4.6 could handle at all.

The model also introduced something new to the Opus line: the ability to verify its own outputs before responding. Rather than surfacing a confident answer and moving on, Opus 4.7 could catch its own mistakes mid-task, a behavior that proved especially useful in long, multi-step agentic workflows.

The launch came at a pointed moment for Anthropic. In the weeks before April 16, user frustration with Opus 4.6 had spilled into public forums, with complaints that Claude had quietly gotten worse. A senior director at AMD posted on GitHub that "Claude has regressed to the point it cannot be trusted to perform complex engineering." Anthropic denied any intentional degradation, and Opus 4.7 landed squarely as the company's answer — retaking the lead over ChatGPT 5.4 and Gemini 3.1 Pro across key benchmarks.

Enter Opus 4.8

The successor arrived less than six weeks later with no fanfare buildup and a notably understated pitch. Anthropic described Opus 4.8 as "a modest but tangible improvement on its predecessor" — language that's unusual for a launch post, but accurate.

The clearest upgrade is around honesty and reliability. Opus 4.8 is roughly four times less likely than 4.7 to let flaws in its own code go unreported. On one internal evaluation, it became the first Claude model to score 0% on uncritically accepting flawed outputs — meaning it consistently flagged problems rather than quietly passing them along. For teams running production-grade agentic code pipelines, that's a meaningful shift.

Pricing held flat at $5/$25 per million tokens. A new Fast Mode runs at roughly 2.5× the speed at $10/$50. Alongside the model, Anthropic shipped Dynamic Workflows in Claude Code — a research preview that lets the system spin up and coordinate parallel subagents to tackle larger-scale tasks — and added effort-level controls to the claude.ai interface for all plan tiers.

A Sign of the Times

What's notable isn't just that Opus 4.7 had a short run — it's how short. Previous Opus releases had been separated by months. The 10-week window between Opus 4.6 and 4.7 was already quicker than usual. Forty-two days is another level entirely.

Opus 4.7 was, by every available measure, the strongest publicly available AI model in the world for most of its lifespan. It's also already been superseded. In the current pace of AI development, that's not an anomaly — it's just the way things work now.