Groq Llama 4 Scout

Groq Llama 4 Scout: $0.11/1M input tokens, 1,100 tok/s throughput, 120 ms latency. Compare on BenchNode.io.

$0.11 per 1M input tokens · $0.11 per 1M output tokens · Free tier available

Specifications

model id: llama-4-scout
model family: Llama 4
parameters: 109B MoE
context window tokens: 131,072
modality: text
reasoning: No

Performance

latency (TTFT): 120 ms (lower is better)
uptime SLA: 99.9%
throughput: 1,100 tok/s
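
As a rough illustration of how the latency and throughput figures combine, the sketch below estimates the wall-clock time for a single response. It assumes a simple model (time to first token plus output tokens divided by steady-state throughput) and ignores network overhead and queueing; the function name `estimated_response_seconds` is hypothetical and not part of any Groq API.

```python
# Figures taken from the listing above.
TTFT_SECONDS = 0.120          # time to first token: 120 ms
THROUGHPUT_TOK_PER_S = 1100   # output throughput: 1,100 tok/s


def estimated_response_seconds(output_tokens: int) -> float:
    """Estimate the total response time for one request.

    Assumption: total time = TTFT + output_tokens / throughput.
    Real requests also incur network and queueing delays.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return TTFT_SECONDS + output_tokens / THROUGHPUT_TOK_PER_S


if __name__ == "__main__":
    for n in (100, 550, 2000):
        print(f"{n} output tokens: ~{estimated_response_seconds(n):.2f} s")
```

Under this model a 550-token reply takes about 0.62 s, with the 120 ms first-token delay accounting for roughly a fifth of the total.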

Pricing Detail

input per 1m tokens usd: 0.11
output per 1m tokens usd: 0.11
free tier available: Yes
rate limit tpm: 6,000
rate limit rpm: 30
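
To make the pricing and rate-limit figures concrete, here is a minimal sketch that computes the cost of a request and the number of requests per minute that fit within both limits. It assumes the TPM limit counts input and output tokens together, which the listing does not state; the function names are hypothetical helpers, not part of any provider SDK.

```python
# Figures taken from the pricing detail above.
INPUT_USD_PER_1M = 0.11
OUTPUT_USD_PER_1M = 0.11
RATE_LIMIT_TPM = 6000   # tokens per minute
RATE_LIMIT_RPM = 30     # requests per minute


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at the listed per-1M-token prices."""
    return (
        input_tokens * INPUT_USD_PER_1M + output_tokens * OUTPUT_USD_PER_1M
    ) / 1_000_000


def max_requests_per_minute(tokens_per_request: int) -> int:
    """Requests per minute allowed by whichever limit is hit first.

    Assumption: tokens_per_request is input + output tokens, and both
    count toward the TPM limit.
    """
    if tokens_per_request <= 0:
        raise ValueError("tokens_per_request must be positive")
    return min(RATE_LIMIT_RPM, RATE_LIMIT_TPM // tokens_per_request)


if __name__ == "__main__":
    print(f"cost: ${request_cost_usd(800, 200):.6f}")
    print(f"requests/min at 1,000 tokens each: {max_requests_per_minute(1000)}")
```

With these limits, requests of 200 tokens or fewer are bounded by the 30 RPM cap, while larger requests are bounded by the 6,000 TPM cap. At 1,100 tok/s, the full 6,000-token minute budget corresponds to only about 5.5 seconds of generation.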

AI Analysis · gpt-4o-mini

Technical Verdict

At $0.11 per 1M tokens for both input and output, pricing is mid-range for the model class, and the 1,100 tok/s throughput is fast. The rate limits of 6,000 TPM and 30 RPM suit low-volume prototyping rather than sustained production traffic, and the 120 ms time to first token favors real-time applications over batch processing.