Groq Llama 4 Maverick

Groq Llama 4 Maverick: $0.2/1M input tokens, 900 tok/s throughput, 180ms latency. Compare on BenchNode.io.

$0.2 per 1M input tokens

$0.2 per 1M output tokens

Free tier available

Specifications

model id: llama-4-maverick
model family: Llama 4
parameters: 400B MoE
context window tokens: 1,000,000
modality: text
reasoning: No

Performance

Latency (TTFT): 180 ms (lower is better)
Uptime SLA: 99.9%
Throughput: 900 tok/s
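
The latency and throughput figures above can be combined into a rough response-time estimate. The sketch below assumes a simple model that is not stated on this page: total time is approximately the time to first token plus the output length divided by throughput. It ignores network overhead, queueing, and prompt length, and the function name is illustrative only.

```python
# Rough response-time estimate from the listed figures.
# Assumption: total time ~= TTFT + output_tokens / throughput.
# Real latency also depends on network, queueing, and prompt size.

TTFT_MS = 180.0            # listed time to first token, in milliseconds
THROUGHPUT_TOK_S = 900.0   # listed output throughput, in tokens per second


def estimate_response_ms(
    output_tokens: int,
    ttft_ms: float = TTFT_MS,
    throughput_tok_s: float = THROUGHPUT_TOK_S,
) -> float:
    """Return an approximate end-to-end response time in milliseconds."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if throughput_tok_s <= 0:
        raise ValueError("throughput_tok_s must be positive")
    generation_ms = output_tokens / throughput_tok_s * 1000.0
    return ttft_ms + generation_ms


if __name__ == "__main__":
    for n in (0, 450, 900):
        print(f"{n} output tokens: about {estimate_response_ms(n):.0f} ms")
```

Under this model, a 450-token reply would take roughly 680 ms: 180 ms to the first token plus 500 ms of generation.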

Pricing Detail

input per 1M tokens (USD): 0.2
output per 1M tokens (USD): 0.2
free tier available: Yes
rate limit (TPM, tokens per minute): 6,000
rate limit (RPM, requests per minute): 30
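
To make the per-token prices concrete, here is a minimal cost calculator using the listed rates. It assumes billing is strictly linear in token count, with no minimum charge, caching discount, or free-tier allowance applied; the page does not describe any of these, and the function name is illustrative only.

```python
# Cost estimate from the listed per-1M-token prices.
# Assumption: billing is linear in tokens, with no discounts or minimums.

INPUT_USD_PER_1M = 0.2    # listed input price per 1M tokens
OUTPUT_USD_PER_1M = 0.2   # listed output price per 1M tokens


def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price: float = INPUT_USD_PER_1M,
    output_price: float = OUTPUT_USD_PER_1M,
) -> float:
    """Return the estimated cost in USD for one request or workload."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * input_price + output_tokens * output_price
    return total / 1_000_000


if __name__ == "__main__":
    # Example workload: 2,000 prompt tokens and 500 completion tokens.
    print(f"${estimate_cost_usd(2_000, 500):.6f} per request")
```

Because input and output are priced identically here, the cost depends only on the total token count: 1M input tokens plus 1M output tokens comes to $0.40.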

AI Analysis · gpt-4o-mini

Technical Verdict

At $0.2 per 1M tokens for both input and output, pricing is mid-tier for the model class, and the 900 tok/s throughput is fast. The rate limits of 6,000 TPM and 30 RPM suit low-volume prototyping: 6,000 tokens per minute averages out to 100 tokens per second, well below the 900 tok/s generation speed, so the rate limit rather than raw throughput is likely to cap sustained usage. The 180 ms time to first token favors real-time applications such as chat or streaming over batch processing.
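
The rate limits quoted in the verdict above translate into a simple request budget. The sketch below assumes that the TPM limit counts input and output tokens together for each request; the page does not say how tokens are counted, so treat that as an assumption. The function names are illustrative only.

```python
# Request budget implied by the listed rate limits.
# Assumption: TPM counts a request's input and output tokens combined.

TPM_LIMIT = 6_000   # listed tokens-per-minute limit
RPM_LIMIT = 30      # listed requests-per-minute limit


def sustained_tokens_per_second(tpm: int = TPM_LIMIT) -> float:
    """Average token rate allowed by the TPM limit."""
    return tpm / 60


def max_requests_per_minute(
    tokens_per_request: int,
    tpm: int = TPM_LIMIT,
    rpm: int = RPM_LIMIT,
) -> int:
    """Requests per minute allowed once both limits are applied."""
    if tokens_per_request <= 0:
        raise ValueError("tokens_per_request must be positive")
    return min(rpm, tpm // tokens_per_request)


if __name__ == "__main__":
    print(sustained_tokens_per_second())   # average tokens per second
    print(max_requests_per_minute(200))    # both limits reached together
    print(max_requests_per_minute(1_000))  # token limit binds first
```

With these numbers, requests averaging 200 tokens hit both limits at once (30 requests per minute). Larger requests are capped by the token limit: at 1,000 tokens each, only 6 requests fit in a minute.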