Anthropic Claude Haiku 4.5

Anthropic Claude Haiku 4.5: $0.8/1M input tokens, 600 tok/s throughput, 400ms latency. Compare on BenchNode.io.


Specifications

model id: claude-haiku-4.5
model family: Claude 4
parameters: undisclosed
context window tokens: 200,000
modality: text + vision
reasoning: No

Performance

Latency (TTFT): 400 ms (lower is better)
Uptime SLA: 99.99%
Throughput: 600 tok/s
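
As a rough illustration of how the two performance figures combine, the sketch below estimates end-to-end response time as first-token latency plus generation time. It assumes the listed 400 ms TTFT and 600 tok/s throughput hold steadily for a single request, which real traffic may not match; the function name `estimated_response_seconds` is hypothetical and not part of any API.

```python
# Listed figures from this page; treated as steady-state values (an assumption).
TTFT_SECONDS = 0.400            # time to first token
THROUGHPUT_TOK_PER_S = 600.0    # output tokens generated per second


def estimated_response_seconds(output_tokens: int) -> float:
    """Rough wall-clock estimate: first-token latency plus generation time."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return TTFT_SECONDS + output_tokens / THROUGHPUT_TOK_PER_S


if __name__ == "__main__":
    for n in (100, 600, 3000):
        print(f"{n:>5} output tokens: ~{estimated_response_seconds(n):.2f} s")
```

Under these assumptions, first-token latency dominates short replies, while generation time dominates long ones.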

Pricing Detail

input per 1m tokens usd: 0.8
output per 1m tokens usd: 4
free tier available: No
rate limit tpm: 200,000
rate limit rpm: 1,000
batch discount pct: 50
context caching input usd: 0.08
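
To make the pricing fields concrete, here is a minimal cost sketch using only the rates listed above. It rests on two assumptions the listing does not confirm: cached input tokens are billed at the caching rate instead of the standard input rate, and the batch discount applies to uncached input and output only (it does not stack with the cached rate). The function name `estimate_cost_usd` is hypothetical.

```python
# Rates copied from the listing (USD per 1M tokens).
INPUT_USD_PER_M = 0.80
OUTPUT_USD_PER_M = 4.00
CACHED_INPUT_USD_PER_M = 0.08
BATCH_DISCOUNT = 0.50  # 50% batch discount


def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      cached_input_tokens: int = 0, batch: bool = False) -> float:
    """Estimate request cost in USD.

    Assumption: the batch discount applies to uncached input and output only;
    cached tokens are always billed at the cached rate.
    """
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")

    fresh_input = input_tokens - cached_input_tokens
    discountable = (fresh_input * INPUT_USD_PER_M
                    + output_tokens * OUTPUT_USD_PER_M) / 1_000_000
    if batch:
        discountable *= 1 - BATCH_DISCOUNT
    cached = cached_input_tokens * CACHED_INPUT_USD_PER_M / 1_000_000
    return discountable + cached


if __name__ == "__main__":
    print(f"standard: ${estimate_cost_usd(1_000_000, 200_000):.3f}")
    print(f"batch:    ${estimate_cost_usd(1_000_000, 200_000, batch=True):.3f}")
    print(f"cached:   ${estimate_cost_usd(1_000_000, 200_000, cached_input_tokens=900_000):.3f}")
```

If the provider does stack the two discounts, the batch figure for cached workloads would be lower than this sketch reports.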

AI Analysis · gpt-4o-mini

Technical Verdict

At $0.80 per 1M input tokens and $4.00 per 1M output tokens, this model is priced at the premium end of its class, and output is billed at five times the input rate. Throughput of 600 tok/s is fast. Bulk asynchronous jobs benefit from the 50% batch discount, and workloads with high prompt reuse benefit from cached input at $0.08 per 1M tokens, a tenth of the standard input rate. Rate limits of 200,000 TPM and 1,000 RPM suggest suitability for high-volume production, while the 400 ms first-token latency favors batch use over real-time applications.

Ideal Use Case

This tier suits high-volume production workloads, such as batch document processing in a data pipeline handling more than 500,000 requests per day.
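
A quick sanity check of that volume against the listed rate limits can be sketched as follows. It assumes requests are spread evenly across the day and that the 200,000 TPM limit counts all tokens in a request; the listing states neither, and whether batch jobs are subject to the same limits is also not stated. The function name `check_rate_limits` is hypothetical.

```python
# Rate limits copied from the listing.
RPM_LIMIT = 1_000
TPM_LIMIT = 200_000
MINUTES_PER_DAY = 24 * 60


def check_rate_limits(daily_requests: int, tokens_per_request: int) -> dict:
    """Compare an evenly spread daily load with the listed RPM and TPM limits."""
    avg_rpm = daily_requests / MINUTES_PER_DAY
    avg_tpm = avg_rpm * tokens_per_request
    return {
        "avg_rpm": avg_rpm,
        "avg_tpm": avg_tpm,
        "within_rpm": avg_rpm <= RPM_LIMIT,
        "within_tpm": avg_tpm <= TPM_LIMIT,
    }


def max_tokens_per_request(daily_requests: int) -> float:
    """Largest average request size that stays under the TPM limit."""
    return TPM_LIMIT * MINUTES_PER_DAY / daily_requests


if __name__ == "__main__":
    print(check_rate_limits(500_000, 500))
    print(max_tokens_per_request(500_000))
```

Under these assumptions, 500,000 requests per day averages about 347 requests per minute, well under the RPM limit, but the TPM limit caps the average request at 576 tokens, so token volume rather than request count is the binding constraint.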