Anthropic's Claude Opus 4.8 is here with 3X cheaper fast mode and near-Mythos level alignment

May 28, 2026 - 22:13

0 4

Anthropic's Claude Opus 4.8 is here with 3X cheaper fast mode and near-Mythos level alignment

Anthropic today released Claude Opus 4.8, an upgrade to its flagship model that ships at the same price as its predecessor, alongside a dramatically cheaper "fast mode" tier and a new feature that lets the model spawn hundreds of parallel subagents for codebase-scale work.

The model is available immediately across Anthropic's surfaces — claude.ai, Claude Code, the API, and Cowork — at unchanged pricing: $5 per million input tokens and $25 per million output tokens. Developers can call it as claude-opus-4-8.

The headline efficiency story is fast mode. Anthropic has slashed the price of running Opus 4.8 in fast mode — where the model produces tokens at roughly 2.5x normal speed — to $10 per million input tokens and $50 per million output tokens, down from $30/$150 for Opus 4.7

Claude Opus 4.8 and 4.7 fast mode pricing chart

That's a 3X reduction from the fast-mode pricing of previous models, and brings high-throughput inference within reach of latency-sensitive production workloads.

Fast mode is available immediately in Claude Code via the /fast command; API access is gated, with a waitlist at claude.com/fast-mode.

In regular mode, Claude Opus 4.8 remains among the more expensive of leading frontier models, but still comes in under chief rival OpenAI's GPT-5.5.

Frontier AI Model API Pricing Snapshot

Model	Input	Output	Total Cost	Source
MiMo-V2.5 Flash	$0.10	$0.30	$0.40	Xiaomi MiMo
MiniMax M2.7	$0.30	$1.20	$1.50	MiniMax
Gemini 3.1 Flash-Lite	$0.25	$1.50	$1.75	Google
MiMo-V2.5	$0.40	$2.00	$2.40	Xiaomi MiMo
Kimi-K2.6	$0.95	$4.00	$4.95	Moonshot/Kimi
GLM-5	$1.00	$3.20	$4.20	Z.ai
Grok 4.3 (low context)	$1.25	$2.50	$3.75	xAI
DeepSeek V4 Pro	$1.74	$3.48	$5.22	DeepSeek
GLM-5.1	$1.40	$4.40	$5.80	Z.ai
Claude Haiku 4.5	$1.00	$5.00	$6.00	Anthropic
Grok 4.3 (high context)	$2.50	$5.00	$7.50	xAI
Qwen3.7-Max	$2.50	$7.50	$10.00	Alibaba Cloud
Gemini 3.5 Flash	$1.50	$9.00	$10.50	Google
Gemini 3.1 Pro Preview (≤200K)	$2.00	$12.00	$14.00	Google
GPT-5.4	$2.50	$15.00	$17.50	OpenAI
Gemini 3.1 Pro Preview (>200K)	$4.00	$18.00	$22.00	Google
Claude Opus 4.7	$5.00	$25.00	$30.00	Anthropic
Claude Opus 4.8	$5.00	$25.00	$30.00	Anthropic
GPT-5.5	$5.00	$30.00	$35.00	OpenAI

Modest gains over 4.7, but Mythos-class capabilities coming

On benchmarks, Opus 4.8 is a step up rather than a leap. It scores 88.6% on SWE-bench Verified (vs. 87.6% for Opus 4.7), 69.2% on the harder SWE-bench Pro (vs. 64.3%), and 74.6% on Terminal-Bench 2.1 (vs. 66.1%). Anthropic itself characterizes the model as "a modest but tangible improvement on its predecessor."

Anthropic Claude Opus 4.8 benchmark comparison chart

It beats GPT-5.5 regular across at least 12 benchmarks, including most knowledge-work, coding (issue-level), agentic tool-use, and long-context benchmarks. GPT-5.5 wins on terminal/CLI workflows and is roughly tied on web browsing and graduate-level science.

The bigger signal sits in Anthropic's internal capability ladder: Opus 4.8 lands between Opus 4.7 and the more capable Claude Mythos Preview, which is currently restricted to a small number of organizations under Project Glasswing for cybersecurity work.

Anthropic says it expects to bring "Mythos-class models to all our customers in the coming weeks" once additional cyber safeguards are in place.

Several enterprise partners cited material gains. Databricks reported that Opus 4.8 unlocks "a step change in agentic reasoning" inside its Genie data agent, at "61% cheaper token cost than Opus 4.7" thanks to multimodal efficiency on PDFs and diagrams.

Hebbia cited better citation precision and token efficiency on dense financial filings. Devin-maker Cognition said the release "translates directly into faster capability gains for engineers" and noted Opus 4.8 fixed comment-verbosity and tool-calling issues from 4.7. A computer-use vendor reported 84% on Online-Mind2Web, a jump over both Opus 4.7 and GPT-5.5.

Dynamic workflows: hundreds of parallel subagents

Alongside the model, Anthropic launched a research preview of dynamic workflows in Claude Code — a feature designed for tasks too large for a single context window. Claude plans the work, spawns hundreds of parallel subagents, then verifies its own outputs before reporting back. Anthropic's example: a codebase-scale migration "across hundreds of thousands of lines of code from kickoff to merge, with the existing test suite as its bar."

Dynamic workflows is available on Claude Code's Enterprise, Team, and Max plans.

Two smaller additions round out the release:

Effort control on claude.ai and Claude Cowork: A new selector lets users dial how much thinking Claude does per response — higher effort spends more tokens for better answers, lower effort responds faster and burns rate limits more slowly. Available on all plans.
System entries inside the messages array on the API: Developers can now update Claude's instructions mid-task — adjusting permissions, token budgets, or environment context as an agent runs — without breaking the prompt cache.

Honesty, and an "evaluation awareness" caveat

Anthropic is leading with honesty as a headline trait. The company's alignment team reports Opus 4.8 is "around four times less likely than its predecessor to allow flaws in code it has written to pass unremarked," and that misaligned behavior rates are now "substantially lower than Opus 4.7, and similar to our best-aligned model, Claude Mythos Preview."

Indeed, a bar chart released by Anthropic shows how close Opus 4.8 is to the still selectively released Mythos in terms of its misalignment (a lower score is better), coming in at roughly 1.9, down from 2.5 for Opus 4.7 and effectively tied with the more capable, restricted Mythos Preview. The score is based on roughly 2,600 simulated investigation sessions per model.

Anthropic Claude Opus 4.8 misalignment bar chart

The 244-page system card publicly released by Anthropic also goes into greater detail on specific categories of misalignment — whether a model produces potentially harmful content around "military-grade weapons," "harmful sexual content", "disallowed cyberoffense", and "undermining liberal democracy," and again, across all of them, Opus 4.8 scores markedly better than 4.7 or Sonnet 4.6, and comes quite close to Mythos.

Anthropic flags one finding it considers "the most concerning" from training: Opus 4.8 shows a growing tendency to reason explicitly about how its outputs will be graded, including in environments where it wasn't told it was being evaluated. In other words: the model knows it is likely being graded, and produces a response it thinks will earn it a good grade on the test, not one it would necessarily produce if it thought it wasn't being graded.

Anthropic says this didn't translate into worse observable behavior — Opus 4.8 shows fewer misleading task-success claims than prior models — but calls it "a concerning trend that could complicate training in the future." Preliminary interpretability work also found unverbalized grader-related reasoning in roughly 5% of training episodes.

Anthropic ran the model through a one-week live bug bounty for prompt injection — a first — and concluded Opus 4.8 sits between Opus 4.7 and Sonnet 4.6 on robustness, ahead of "all comparable frontier models" tested, with deployed safeguards bringing browser-use attack success rates to near zero.

What's next?

Anthropic teased two trajectories. Near-term: cheaper models that provide "many of the same capabilities as Opus." Longer-term: the Mythos-class models, which the company says represent higher intelligence than Opus but require stronger cyber safeguards before general release.

For now, Opus 4.8 is positioned as the new go-to enterprise and development workhorse — slightly smarter than 4.7, dramatically cheaper to run fast, and noticeably more honest about what it doesn't know.