OpenAI GPT-4o mini

OpenAI GPT-4o mini: $0.15/1M input tokens, 550 tok/s throughput, 400ms latency. Compare on BenchNode.io.

$0.15 per 1M input tokens
$0.60 per 1M output tokens

Specifications

model id: gpt-4o-mini
model family: GPT-4o
parameters: undisclosed
context window tokens: 128,000
modality: text + vision
reasoning: No

Performance

latency (TTFT): 400 ms (lower is better)
uptime SLA: 99.99%
throughput: 550 tok/s
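
As a rough illustration of how the latency and throughput figures combine, the sketch below estimates the wall-clock time of a single response. It assumes the 550 tok/s figure is a steady per-request output rate and that generation begins once the 400 ms time to first token has elapsed; neither assumption is confirmed by this listing, and `estimated_response_seconds` is a hypothetical helper name.

```python
# Figures taken from the Performance section of this listing.
TTFT_SECONDS = 0.400          # time to first token
THROUGHPUT_TOK_PER_S = 550.0  # assumed to be a steady per-request output rate


def estimated_response_seconds(output_tokens: int,
                               ttft_s: float = TTFT_SECONDS,
                               tok_per_s: float = THROUGHPUT_TOK_PER_S) -> float:
    """Estimate total response time as TTFT plus streaming time (assumed model)."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return ttft_s + output_tokens / tok_per_s


if __name__ == "__main__":
    for n in (100, 550, 2000):
        print(f"{n:>5} output tokens: ~{estimated_response_seconds(n):.2f} s")
```

Under these assumptions, time to first token dominates short replies while throughput dominates long ones, so real measurements may differ from this simple sum.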

Pricing Detail

input per 1m tokens usd: 0.15
output per 1m tokens usd: 0.6
free tier available: No
rate limit tpm: 2,000,000
rate limit rpm: 500
batch discount pct: 50
context caching input usd: 0.075
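
The prices above can be turned into a per-job estimate with a few lines of arithmetic. The following is a minimal sketch, not an official billing formula: it assumes the cached rate replaces the standard input rate for cached tokens and that the 50% batch discount applies to the whole bill, which this listing does not spell out. The function name `estimate_cost_usd` is hypothetical.

```python
# Prices in USD per 1M tokens, copied from the Pricing Detail section.
INPUT_USD_PER_1M = 0.15
OUTPUT_USD_PER_1M = 0.60
CACHED_INPUT_USD_PER_1M = 0.075
BATCH_DISCOUNT_PCT = 50


def estimate_cost_usd(input_tokens: int,
                      output_tokens: int,
                      cached_input_tokens: int = 0,
                      batch: bool = False) -> float:
    """Estimate the cost of a workload in USD under the stated assumptions."""
    if min(input_tokens, output_tokens, cached_input_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_input_tokens > input_tokens:
        raise ValueError("cached_input_tokens cannot exceed input_tokens")

    uncached = input_tokens - cached_input_tokens
    cost = (uncached * INPUT_USD_PER_1M
            + cached_input_tokens * CACHED_INPUT_USD_PER_1M
            + output_tokens * OUTPUT_USD_PER_1M) / 1_000_000

    # Assumption: the batch discount applies to the entire bill.
    if batch:
        cost *= 1 - BATCH_DISCOUNT_PCT / 100
    return cost


if __name__ == "__main__":
    # 1M input tokens plus 1M output tokens, standard vs. batch.
    print(f"standard: ${estimate_cost_usd(1_000_000, 1_000_000):.3f}")
    print(f"batch:    ${estimate_cost_usd(1_000_000, 1_000_000, batch=True):.3f}")
```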

AI Analysis · gpt-4o-mini

Technical Verdict

With input priced at $0.15 and output at $0.60 per 1M tokens, and a throughput of 550 tok/s, this API tier offers mid-range pricing with fast generation. The rate limits of 2,000,000 TPM and 500 RPM support high-volume production workloads, while the 400 ms time to first token is better suited to batch processing than to latency-sensitive real-time applications.

Ideal Use Case

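This API tier suits a batch document-processing pipeline handling more than 500,000 requests per day, a volume that stays below the roughly 720,000 requests per day implied by the 500 RPM limit. It fits teams that prioritize cost efficiency and can take advantage of the 50% discount on asynchronous batch jobs.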
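
To check whether a planned daily volume fits the listed rate limits, the limits can be converted into daily and per-request budgets. The sketch below assumes traffic is spread evenly across the day and that every request counts against both the RPM and TPM limits; whether batch jobs are metered against the same limits is not stated in this listing. All function names are hypothetical.

```python
# Rate limits copied from the Pricing Detail section.
RATE_LIMIT_RPM = 500
RATE_LIMIT_TPM = 2_000_000
MINUTES_PER_DAY = 24 * 60


def max_requests_per_day(rpm: int = RATE_LIMIT_RPM) -> int:
    """Upper bound on daily requests if the RPM limit is saturated all day."""
    return rpm * MINUTES_PER_DAY


def token_budget_per_request(rpm: int = RATE_LIMIT_RPM,
                             tpm: int = RATE_LIMIT_TPM) -> int:
    """Average tokens each request may use when running at the full RPM limit."""
    return tpm // rpm


def fits_rate_limits(daily_requests: int, tokens_per_request: int) -> bool:
    """Check an evenly spread workload against both limits (assumed model)."""
    avg_rpm = daily_requests / MINUTES_PER_DAY
    avg_tpm = avg_rpm * tokens_per_request
    return avg_rpm <= RATE_LIMIT_RPM and avg_tpm <= RATE_LIMIT_TPM


if __name__ == "__main__":
    print(max_requests_per_day())             # 720000
    print(token_budget_per_request())         # 4000
    print(fits_rate_limits(500_000, 4_000))   # True
    print(fits_rate_limits(500_000, 6_000))   # False: exceeds the TPM limit
```

Bursty traffic can hit either limit well before the daily averages suggest, so these figures are best read as ceilings.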