AI Inference openai-gpt4o-mini ✓ Verified June 4, 2026

OpenAI  GPT-4o mini

OpenAI GPT-4o mini: $0.15 per 1M input tokens, 550 tok/s throughput, 400 ms time to first token (TTFT). Compare on BenchNode.io.

$0.15 per 1M input tokens
$0.60 per 1M output tokens

Specifications

Model ID: gpt-4o-mini
Model family: GPT-4o
Parameters: undisclosed
Context window: 128,000 tokens
Modality: text + vision
Reasoning: No

Performance

Latency (TTFT): 400 ms (lower is better)
Uptime SLA: 99.99%
Throughput: 550 tok/s
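
The two performance figures combine into a rough end-to-end response time: time to first token plus generation time at the listed throughput. The sketch below assumes both numbers hold steadily for a single request and ignores network and queueing overhead; the function name is illustrative, not part of any API.

```python
# Figures taken from the Performance section of this listing.
TTFT_SECONDS = 0.400           # 400 ms time to first token
THROUGHPUT_TOK_PER_S = 550.0   # output tokens generated per second


def estimated_response_seconds(output_tokens: int) -> float:
    """Rough wall-clock estimate: TTFT plus time to stream the output."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return TTFT_SECONDS + output_tokens / THROUGHPUT_TOK_PER_S


if __name__ == "__main__":
    for n in (50, 550, 2000):
        print(f"{n:>5} output tokens: ~{estimated_response_seconds(n):.2f} s")
```

Under these assumptions, a 550-token reply takes about 1.4 s in total, so for long outputs generation time matters more than the 400 ms first-token delay.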

Pricing Detail

Input: $0.15 per 1M tokens
Output: $0.60 per 1M tokens
Free tier available: No
Rate limit (TPM): 2,000,000
Rate limit (RPM): 500
Batch discount: 50%
Context caching (input): $0.075 per 1M tokens
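
As a rough illustration of how these prices translate into a per-request cost, the sketch below bills cached input tokens at the context-caching rate and the remaining input tokens at the standard rate. That billing split is an assumption made for the example, and the function name is hypothetical.

```python
# Prices in USD per 1M tokens, from the Pricing Detail section.
INPUT_USD_PER_M = 0.15
OUTPUT_USD_PER_M = 0.60
CACHED_INPUT_USD_PER_M = 0.075


def request_cost_usd(input_tokens: int, output_tokens: int,
                     cached_input_tokens: int = 0) -> float:
    """Estimate the cost of one request in USD.

    Assumption: cached input tokens are a subset of input_tokens and are
    billed at the cached rate instead of the standard input rate.
    """
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    uncached = input_tokens - cached_input_tokens
    total = (uncached * INPUT_USD_PER_M
             + cached_input_tokens * CACHED_INPUT_USD_PER_M
             + output_tokens * OUTPUT_USD_PER_M)
    return total / 1_000_000


if __name__ == "__main__":
    # Example: 10,000 input tokens (8,000 of them cached) and 500 output tokens.
    print(f"${request_cost_usd(10_000, 500, cached_input_tokens=8_000):.6f}")
```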

AI Analysis · gpt-4o-mini

Technical Verdict

At $0.15 per 1M input tokens and $0.60 per 1M output tokens, with a throughput of 550 tok/s, this API tier sits in the mid-range on price and is fast on throughput. The rate limits of 2,000,000 TPM and 500 RPM suit high-volume production environments; for short requests the RPM limit is the tighter of the two, capping sustained traffic at 720,000 requests per day (500 × 1,440 minutes). The 400 ms time to first token makes the tier a better fit for batch processing than for latency-sensitive real-time applications.
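
To see which rate limit binds for a given workload, the arithmetic can be sketched as follows. It assumes every request uses the same number of tokens, that input and output tokens both count toward TPM, and that traffic is spread evenly over each minute; the function names are illustrative.

```python
# Rate limits from the Pricing Detail section.
RPM_LIMIT = 500          # requests per minute
TPM_LIMIT = 2_000_000    # tokens per minute
MINUTES_PER_DAY = 24 * 60


def sustainable_rpm(tokens_per_request: int) -> int:
    """Requests per minute allowed once both limits are respected."""
    if tokens_per_request <= 0:
        raise ValueError("tokens_per_request must be positive")
    return min(RPM_LIMIT, TPM_LIMIT // tokens_per_request)


def max_requests_per_day(tokens_per_request: int) -> int:
    """Upper bound on daily requests at a steady, evenly spread rate."""
    return sustainable_rpm(tokens_per_request) * MINUTES_PER_DAY


if __name__ == "__main__":
    for tokens in (1_000, 4_000, 8_000):
        print(tokens, sustainable_rpm(tokens), max_requests_per_day(tokens))
```

Up to 4,000 tokens per request (2,000,000 / 500), the RPM limit is the constraint; above that, the TPM limit takes over and the sustainable request rate falls.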

Ideal Use Case

This API tier suits batch document-processing pipelines that handle more than 500,000 requests per day, where teams prioritize cost efficiency and can use the 50% discount on asynchronous batch jobs.
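
A back-of-the-envelope daily cost for such a pipeline might look like the sketch below. The per-request token counts (2,000 input, 300 output) are hypothetical, and the example assumes the 50% batch discount applies equally to input and output tokens.

```python
# Prices and discount from the Pricing Detail section.
INPUT_USD_PER_M = 0.15
OUTPUT_USD_PER_M = 0.60
BATCH_DISCOUNT_PCT = 50


def daily_cost_usd(requests_per_day: int,
                   input_tokens_per_request: int,
                   output_tokens_per_request: int,
                   batch: bool = False) -> float:
    """Estimate daily spend in USD for a uniform workload.

    Assumption: the batch discount applies to the whole bill.
    """
    input_cost = requests_per_day * input_tokens_per_request * INPUT_USD_PER_M / 1_000_000
    output_cost = requests_per_day * output_tokens_per_request * OUTPUT_USD_PER_M / 1_000_000
    total = input_cost + output_cost
    if batch:
        total *= 1 - BATCH_DISCOUNT_PCT / 100
    return total


if __name__ == "__main__":
    # Hypothetical workload: 500,000 documents per day.
    standard = daily_cost_usd(500_000, 2_000, 300)
    batched = daily_cost_usd(500_000, 2_000, 300, batch=True)
    print(f"standard: ${standard:,.2f}/day, batch: ${batched:,.2f}/day")
```

With these assumed token counts, the workload costs about $240 per day at standard rates and about $120 per day through batch jobs.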
