US-East 42ms p50 · EU-Central 38ms p50 · AP-South 71ms p50 · ETH Block #21.4M head · SOL Block #312.1M head · Groq 750 tok/s · Together 600 tok/s · Alchemy 25ms rpc
Verified June 4, 2026

OpenAI o3-mini

OpenAI o3-mini: $1.1/1M input tokens, 280 tok/s throughput, 900ms latency. Compare on BenchNode.io.

$1.1
per 1M input tokens
$4.4 / 1M output

Specifications

model id
o3-mini
model family
o3
parameters
undisclosed
context window tokens
200,000
modality
text
reasoning
Yes

Performance

Latency (TTFT) 900 ms
lower is better
Uptime SLA 99.99%
Throughput 280 tok/s

Pricing Detail

input per 1M tokens (USD)
1.1
output per 1M tokens (USD)
4.4
free tier available
No
rate limit (TPM)
500,000
rate limit (RPM)
500
batch discount (%)
50
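
The listed prices can be turned into a rough per-workload estimate. The sketch below is illustrative only: the function name `estimate_cost_usd` is hypothetical, and it assumes the 50% batch discount applies equally to input and output tokens, which the listing does not state.

```python
# Listed values from the Pricing Detail section (USD per 1M tokens).
INPUT_USD_PER_M = 1.1
OUTPUT_USD_PER_M = 4.4
BATCH_DISCOUNT_PCT = 50


def estimate_cost_usd(input_tokens: int, output_tokens: int, batch: bool = False) -> float:
    """Estimate the cost of a workload at the listed prices.

    Assumption: the batch discount applies to both input and output tokens.
    """
    cost = (input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M) / 1_000_000
    if batch:
        cost *= 1 - BATCH_DISCOUNT_PCT / 100
    return cost


# Example workload: 2M input tokens and 500k output tokens.
print(round(estimate_cost_usd(2_000_000, 500_000), 2))              # standard pricing
print(round(estimate_cost_usd(2_000_000, 500_000, batch=True), 2))  # with batch discount
```

For this example workload the estimate is about $4.40 at standard pricing and about $2.20 with the batch discount.
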
AI Analysis · gpt-4o-mini

Technical Verdict

At $1.1 per 1M input tokens and $4.4 per 1M output tokens, this API tier sits at a mid-range price point. Throughput of 280 tok/s is slow for high-demand applications, and the 900 ms first-token latency favors batch workloads over real-time use. The 50% discount on async batch jobs lowers the cost of bulk processing. The rate limits of 500,000 TPM and 500 RPM suit low-volume prototyping.
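
To see why these figures lean toward batch use, the listed latency and throughput can be combined into a rough response-time estimate. This is a minimal sketch, assuming throughput stays constant once the first token arrives; the function name `estimate_response_seconds` is hypothetical.

```python
TTFT_MS = 900           # listed first-token latency
THROUGHPUT_TOK_S = 280  # listed throughput


def estimate_response_seconds(output_tokens: int) -> float:
    """Rough end-to-end time: first-token latency plus generation time.

    Assumption: tokens are produced at a constant rate after the first token.
    """
    return TTFT_MS / 1000 + output_tokens / THROUGHPUT_TOK_S


# A 560-token response: 0.9 s to first token plus 2.0 s of generation.
print(round(estimate_response_seconds(560), 2))
```

Under these assumptions a 560-token response takes about 2.9 seconds, and longer outputs scale linearly from there.
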

Ideal Use Case

Batch document processing for a team of 3-5 engineers that needs more than 500k requests per day and can tolerate moderate latency.
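
Whether a daily request volume fits the listed rate limits can be checked with simple arithmetic. The sketch below assumes requests are spread evenly across the day and that all traffic counts against the listed 500 RPM and 500,000 TPM limits; the listing does not say whether batch jobs are metered separately. The helper names are hypothetical.

```python
RPM_LIMIT = 500      # listed requests per minute
TPM_LIMIT = 500_000  # listed tokens per minute
MINUTES_PER_DAY = 24 * 60


def max_daily_requests(rpm_limit: int = RPM_LIMIT) -> int:
    """Upper bound on requests per day if the RPM limit is saturated all day."""
    return rpm_limit * MINUTES_PER_DAY


def required_rpm(daily_requests: int) -> float:
    """Average requests per minute needed to serve a daily volume evenly."""
    return daily_requests / MINUTES_PER_DAY


def tokens_per_request_at_full_rpm() -> float:
    """Average token budget per request when both limits are saturated."""
    return TPM_LIMIT / RPM_LIMIT


print(max_daily_requests())                # ceiling on daily requests
print(round(required_rpm(500_000), 1))     # average RPM for 500k requests/day
print(tokens_per_request_at_full_rpm())    # tokens available per request
```

Under these assumptions the ceiling is 720,000 requests per day, so 500k daily requests needs roughly 347 RPM on average and leaves limited headroom, with about 1,000 tokens per request if both limits are saturated.
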

