Google Gemini 2.5 Flash

Google Gemini 2.5 Flash: $0.075 per 1M input tokens, 580 tok/s throughput, 350 ms time to first token. Compare on BenchNode.io.

$0.075 per 1M input tokens · $0.30 per 1M output tokens · Free tier available

Specifications

model id: gemini-2.5-flash
model family: Gemini 2.5
parameters: undisclosed
context window tokens: 1,000,000
modality: text + vision + audio
reasoning: No

Performance

Latency (TTFT): 350 ms (lower is better)
Uptime SLA: 99.95%
Throughput: 580 tok/s
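
The latency and throughput figures can be combined into a rough estimate of how long a full response takes. The sketch below assumes a simple linear model (time to first token plus output tokens divided by throughput); the function name is illustrative, and real response times will vary with load, prompt size, and network conditions.

```python
# Figures taken from the listing above.
TTFT_SECONDS = 0.350            # time to first token
THROUGHPUT_TOK_PER_S = 580.0    # output tokens per second


def estimate_response_seconds(output_tokens: int) -> float:
    """Rough time to receive a complete response.

    Assumption: the first token arrives after TTFT and the remaining
    tokens stream at a constant rate. This is a simplification.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return TTFT_SECONDS + output_tokens / THROUGHPUT_TOK_PER_S


if __name__ == "__main__":
    for n in (50, 500, 2000):
        print(f"{n:>5} output tokens: ~{estimate_response_seconds(n):.2f} s")
```

Under this model the fixed 350 ms dominates short replies, while generation time dominates long ones.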

Pricing Detail

input per 1m tokens usd: 0.075
output per 1m tokens usd: 0.3
free tier available: Yes
rate limit tpm: 1,000,000
rate limit rpm: 2,000
context caching input usd: 0.019
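
As a worked example of the pricing above, the following sketch estimates the cost of a single request. It assumes that cached input tokens are billed at the context caching rate in place of the standard input rate, and it ignores any cache storage fees or free tier allowances, none of which are specified in this listing. The function name is illustrative.

```python
# Prices in USD per 1M tokens, taken from the listing above.
INPUT_USD_PER_1M = 0.075
OUTPUT_USD_PER_1M = 0.30
CACHED_INPUT_USD_PER_1M = 0.019


def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      cached_input_tokens: int = 0) -> float:
    """Estimate the cost of one request in USD.

    Assumption: cached_input_tokens is the part of input_tokens served
    from the context cache and billed at the cached rate instead of the
    standard input rate.
    """
    if min(input_tokens, output_tokens, cached_input_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_input_tokens > input_tokens:
        raise ValueError("cached tokens cannot exceed input tokens")
    fresh_input = input_tokens - cached_input_tokens
    total = (fresh_input * INPUT_USD_PER_1M
             + cached_input_tokens * CACHED_INPUT_USD_PER_1M
             + output_tokens * OUTPUT_USD_PER_1M)
    return total / 1_000_000


if __name__ == "__main__":
    # A 10,000-token prompt with a 1,000-token reply.
    print(f"${estimate_cost_usd(10_000, 1_000):.6f}")
    # The same request with 8,000 prompt tokens served from cache.
    print(f"${estimate_cost_usd(10_000, 1_000, cached_input_tokens=8_000):.6f}")
```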

AI Analysis · gpt-4o-mini

Technical Verdict

At $0.075 per 1M input tokens and $0.30 per 1M output tokens, this model sits in the mid-price range for its class, and its throughput of 580 tok/s indicates fast generation. The rate limits of 1,000,000 TPM and 2,000 RPM suggest suitability for high-volume production environments. The 350 ms time to first token is the weaker figure: it leans towards batch and throughput-oriented use rather than strictly latency-sensitive real-time applications.
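
To make the rate-limit reasoning concrete, the sketch below works out which of the two limits binds for a given request size. It assumes the TPM limit counts every token in a request (input plus output) and that both limits apply per minute to the same account or project; the listing does not state either detail, and the function names are illustrative.

```python
# Rate limits taken from the listing above.
TPM_LIMIT = 1_000_000   # tokens per minute
RPM_LIMIT = 2_000       # requests per minute


def max_requests_per_minute(tokens_per_request: int) -> int:
    """Highest sustained request rate allowed by both limits."""
    if tokens_per_request <= 0:
        raise ValueError("tokens_per_request must be positive")
    return min(RPM_LIMIT, TPM_LIMIT // tokens_per_request)


def binding_limit(tokens_per_request: int) -> str:
    """Return which limit is reached first at this request size."""
    if tokens_per_request <= 0:
        raise ValueError("tokens_per_request must be positive")
    # At the RPM ceiling, total tokens per minute is size * RPM_LIMIT.
    return "RPM" if tokens_per_request * RPM_LIMIT <= TPM_LIMIT else "TPM"


if __name__ == "__main__":
    for size in (100, 500, 4_000):
        print(size, max_requests_per_minute(size), binding_limit(size))
```

With these numbers the two limits meet at 500 tokens per request (1,000,000 / 2,000): smaller requests are capped by RPM, larger ones by TPM.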