AI Inference · anthropic-claude-opus-48 · ✓ Verified June 4, 2026

Anthropic  Claude Opus 4.8

Anthropic Claude Opus 4.8: $15 per 1M input tokens, 160 tok/s throughput, 1200 ms latency (time to first token). Compare on BenchNode.io.

$15 per 1M input tokens
$75 per 1M output tokens

Specifications

model id: claude-opus-4.8
model family: Claude 4
parameters: undisclosed
context window (tokens): 200,000
modality: text + vision
reasoning: No

Performance

Latency (TTFT): 1200 ms (lower is better)
Uptime SLA: 99.99%
Throughput: 160 tok/s
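
The latency and throughput figures can be combined into a rough end-to-end response time. The sketch below assumes that generation proceeds at a constant 160 tok/s once the first token arrives, which is a simplification; the function name is illustrative and not part of any provider API.

```python
# Figures taken from the Performance listing.
TTFT_MS = 1200          # time to first token, in milliseconds
THROUGHPUT_TOK_S = 160  # output tokens per second


def estimated_response_seconds(output_tokens: int) -> float:
    """Rough wall-clock estimate: TTFT plus steady-state generation time.

    Assumption: throughput stays constant after the first token.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return TTFT_MS / 1000 + output_tokens / THROUGHPUT_TOK_S


if __name__ == "__main__":
    for n in (160, 800, 4000):
        print(f"{n} output tokens: ~{estimated_response_seconds(n):.1f} s")
```

Under these assumptions, longer completions are dominated by generation time rather than by the 1200 ms first-token delay.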

Pricing Detail

input per 1M tokens (USD): 15
output per 1M tokens (USD): 75
free tier available: No
rate limit (TPM): 40,000
rate limit (RPM): 50
batch discount (%): 50
context caching input (USD): 1.5
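
The pricing fields can be turned into a per-request cost estimate. This is a minimal sketch under two assumptions that the listing does not confirm: that the 1.5 context-caching figure is a price in USD per 1M cached input tokens, and that the 50% batch discount applies to the whole request. The function and parameter names are hypothetical.

```python
# Prices from the Pricing Detail listing, in USD per 1M tokens.
INPUT_USD_PER_M = 15.0
OUTPUT_USD_PER_M = 75.0
CACHED_INPUT_USD_PER_M = 1.5  # assumption: per 1M cached input tokens
BATCH_DISCOUNT = 0.50         # assumption: applies to input and output alike


def request_cost_usd(input_tokens: int, output_tokens: int,
                     cached_input_tokens: int = 0, batch: bool = False) -> float:
    """Estimate the cost of one request in USD."""
    if cached_input_tokens > input_tokens:
        raise ValueError("cached_input_tokens cannot exceed input_tokens")
    fresh_input = input_tokens - cached_input_tokens
    cost = (fresh_input * INPUT_USD_PER_M
            + cached_input_tokens * CACHED_INPUT_USD_PER_M
            + output_tokens * OUTPUT_USD_PER_M) / 1_000_000
    if batch:
        cost *= 1 - BATCH_DISCOUNT
    return cost


if __name__ == "__main__":
    # 10,000 input tokens (8,000 of them cached) and 1,000 output tokens.
    print(f"standard: ${request_cost_usd(10_000, 1_000, 8_000):.4f}")
    print(f"batch:    ${request_cost_usd(10_000, 1_000, 8_000, batch=True):.4f}")
```
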
AI Analysis · gpt-4o-mini

Technical Verdict

The input price of $15 per 1M tokens and output price of $75 per 1M tokens put this model in the premium tier, and its 160 tok/s throughput is relatively slow. With rate limits of 40,000 TPM and 50 RPM, this tier is better suited to low-volume prototyping than to high-volume production, and the 1200 ms first-token latency favors batch use over real-time applications.
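
To see which of the two rate limits binds for a given request size, the limits can be compared directly. The sketch assumes the TPM limit counts input and output tokens together, which the listing does not specify; the function name is illustrative.

```python
# Rate limits from the Pricing Detail listing.
RATE_LIMIT_TPM = 40_000  # tokens per minute
RATE_LIMIT_RPM = 50      # requests per minute


def sustained_requests_per_minute(tokens_per_request: int) -> float:
    """Highest sustained request rate allowed by both limits.

    Assumption: tokens_per_request is input plus output tokens combined.
    """
    if tokens_per_request <= 0:
        raise ValueError("tokens_per_request must be positive")
    return min(RATE_LIMIT_RPM, RATE_LIMIT_TPM / tokens_per_request)


if __name__ == "__main__":
    for size in (200, 800, 4_000):
        print(f"{size} tokens/request: {sustained_requests_per_minute(size):g} req/min")
```

Under that assumption, requests larger than 800 tokens (40,000 / 50) are limited by TPM rather than RPM.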

Ideal Use Case

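This tier suits a batch document-processing workload run by a team of 5-10 engineers that can tolerate higher latency. The stated volume of >500k daily requests exceeds what the listed 50 RPM limit allows (50 × 1,440 minutes = 72,000 requests per day), so that volume would need a higher rate limit than the one shown here.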
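
The daily request volume can be checked against the listed RPM limit with simple arithmetic. The sketch assumes the 50 RPM limit applies uniformly around the clock and to all traffic; the listing does not say whether batch jobs have separate limits.

```python
RATE_LIMIT_RPM = 50       # requests per minute, from the listing
MINUTES_PER_DAY = 24 * 60


def max_daily_requests(rpm: int = RATE_LIMIT_RPM) -> int:
    """Upper bound on requests per day at a constant per-minute limit."""
    return rpm * MINUTES_PER_DAY


def required_rpm(daily_requests: int) -> float:
    """Average requests per minute needed to reach a daily volume."""
    return daily_requests / MINUTES_PER_DAY


if __name__ == "__main__":
    print(f"ceiling at 50 RPM: {max_daily_requests():,} requests/day")
    print(f"RPM needed for 500,000/day: {required_rpm(500_000):.1f}")
```
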

Anthropic billing: pay-as-you-go · 50% batch discount