AI Inference openai-gpt4o ✓ Verified June 4, 2026

OpenAI  GPT-4o

OpenAI GPT-4o: $2.5 per 1M input tokens, 320 tok/s throughput, 600 ms latency (TTFT). Compare on BenchNode.io.

$2.5
per 1M input tokens
$10 / 1M output

Specifications

model id: gpt-4o
model family: GPT-4o
parameters: undisclosed
context window (tokens): 128,000
modality: text + vision
reasoning: No

Performance

Latency (TTFT): 600 ms (lower is better)
Uptime SLA: 99.99%
Throughput: 320 tok/s
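The two performance figures can be combined into a rough end-to-end response time. The sketch below assumes a simple model that the listing does not state: total time equals the time to first token plus the output length divided by a steady decode throughput. The function name and the sample output lengths are illustrative only.

```python
# Rough response-time model built from the listed figures.
# Assumption (not from the listing): decode speed is constant after the first token.

TTFT_MS = 600               # listed time to first token
THROUGHPUT_TOK_PER_S = 320  # listed throughput


def estimate_response_seconds(output_tokens, ttft_ms=TTFT_MS,
                              tok_per_s=THROUGHPUT_TOK_PER_S):
    """Return estimated seconds to receive `output_tokens` tokens."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return ttft_ms / 1000 + output_tokens / tok_per_s


if __name__ == "__main__":
    for n in (50, 500, 2000):
        print(f"{n} output tokens: ~{estimate_response_seconds(n):.2f} s")
```

Under this model, short replies are dominated by the 600 ms wait for the first token, while long generations are dominated by throughput.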

Pricing Detail

input (USD per 1M tokens): 2.5
output (USD per 1M tokens): 10
free tier available: No
rate limit (TPM): 800,000
rate limit (RPM): 500
batch discount (%): 50
context caching input (USD per 1M tokens): 1.25
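As an illustration of how these prices combine, here is a minimal cost estimator. It makes two assumptions that the listing does not confirm: the 50% batch discount applies to both input and output tokens, and cached-input pricing is not combined with the batch discount. The function and parameter names are hypothetical.

```python
# Per-request cost sketch using the listed prices.
# Assumptions (not from the listing): the batch discount covers input and
# output alike, and cached-input pricing does not stack with batch pricing.

INPUT_USD_PER_1M = 2.5
OUTPUT_USD_PER_1M = 10.0
CACHED_INPUT_USD_PER_1M = 1.25
BATCH_DISCOUNT_PCT = 50


def estimate_cost_usd(input_tokens, output_tokens,
                      cached_input_tokens=0, batch=False):
    """Return the estimated USD cost of one request or one workload."""
    if cached_input_tokens > input_tokens:
        raise ValueError("cached_input_tokens cannot exceed input_tokens")
    if batch and cached_input_tokens:
        raise ValueError("this sketch does not combine batch and cached pricing")
    fresh_input = input_tokens - cached_input_tokens
    cost = (fresh_input * INPUT_USD_PER_1M
            + cached_input_tokens * CACHED_INPUT_USD_PER_1M
            + output_tokens * OUTPUT_USD_PER_1M) / 1_000_000
    if batch:
        cost *= (100 - BATCH_DISCOUNT_PCT) / 100
    return cost


if __name__ == "__main__":
    print(estimate_cost_usd(1_000_000, 1_000_000))              # list price
    print(estimate_cost_usd(1_000_000, 1_000_000, batch=True))  # batch price
```

Keeping batch and cached pricing mutually exclusive is a conservative choice. If the provider does allow both, the estimate here would be an upper bound.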
AI Analysis · gpt-4o-mini

Technical Verdict

Input costs $2.5 per 1M tokens and output costs $10 per 1M tokens, a premium price point for this model class. Throughput of 320 tok/s is fast. Bulk asynchronous jobs benefit from the 50% batch discount, and workloads with heavy prompt reuse benefit from cached input at $1.25 per 1M tokens. Rate limits of 800,000 TPM and 500 RPM suit high-volume production, while the 600 ms time to first token (TTFT) favors batch use over real-time applications.

Ideal Use Case

This tier suits a batch document-processing pipeline that handles more than 500k requests per day and tolerates moderate latency.
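Whether a daily volume of that size fits can be checked against the listed rate limits. The sketch below assumes traffic is spread evenly across the day and that both limits (500 RPM and 800,000 TPM) apply to every request. A provider's batch endpoint may be metered differently, which the listing does not specify. The function name is hypothetical.

```python
# Daily request ceiling implied by the listed rate limits.
# Assumptions (not from the listing): traffic is uniform over 24 hours and
# both the RPM and TPM limits apply to every request.

RPM_LIMIT = 500
TPM_LIMIT = 800_000
MINUTES_PER_DAY = 24 * 60


def max_daily_requests(avg_tokens_per_request):
    """Return the request ceiling per day for a given average request size."""
    if avg_tokens_per_request <= 0:
        raise ValueError("avg_tokens_per_request must be positive")
    rpm_allowed_by_tokens = TPM_LIMIT / avg_tokens_per_request
    effective_rpm = min(RPM_LIMIT, rpm_allowed_by_tokens)
    return int(effective_rpm * MINUTES_PER_DAY)


if __name__ == "__main__":
    for tokens in (1_000, 2_000, 4_000):
        print(f"{tokens} tokens/request: {max_daily_requests(tokens)} requests/day")
```

At 500 RPM the ceiling is 720,000 requests per day, so a 500k target leaves limited headroom. Once the average request exceeds 1,600 tokens (800,000 / 500), the TPM limit becomes the binding constraint; at 4,000 tokens per request the ceiling falls to 288,000 per day.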

