Groq
LPU (Language Processing Unit) inference provider focused on very fast inference, advertising speeds of 1000+ tokens/second. Pricing is linear per token, with no hidden costs.
Plans & Pricing
Free Tier
$0
- All supported open models
- Rate limits apply
- Prompt caching available
- Batch API: 50% discount
API Pay-Per-Use
From $0.05/1M tokens
- Llama 3.1 8B Instant: $0.05/1M in / $0.08/1M out (840 TPS)
- Llama 4 Scout: $0.11/1M in / $0.34/1M out (594 TPS)
- GPT OSS 20B: $0.075/1M in / $0.30/1M out (1000 TPS)
- Qwen3 32B: $0.29/1M in / $0.59/1M out (662 TPS)
- Up to 1000+ TPS
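As a rough illustration of how the linear per-token pricing works out, the sketch below estimates the cost of a request and the time to generate its output, using the prices and TPS figures listed above. The function names and the `batch` flag are hypothetical helpers, not part of any Groq SDK. Applying the 50% Batch API discount to these pay-per-use prices is an assumption, and the TPS values are treated as steady output throughput, which real requests may not match.

```python
# Prices are USD per 1M tokens; tps is tokens/second. Both are copied from
# the pay-per-use list above.
PRICING = {
    "Llama 3.1 8B Instant": {"in": 0.05, "out": 0.08, "tps": 840},
    "Llama 4 Scout": {"in": 0.11, "out": 0.34, "tps": 594},
    "GPT OSS 20B": {"in": 0.075, "out": 0.30, "tps": 1000},
    "Qwen3 32B": {"in": 0.29, "out": 0.59, "tps": 662},
}

BATCH_DISCOUNT = 0.5  # assumption: the listed 50% batch discount applies to these prices


def estimate_cost(model, input_tokens, output_tokens, batch=False):
    """Return the estimated cost in USD for one workload on the given model."""
    price = PRICING[model]
    cost = (input_tokens * price["in"] + output_tokens * price["out"]) / 1_000_000
    return cost * BATCH_DISCOUNT if batch else cost


def estimate_generation_seconds(model, output_tokens):
    """Return the output generation time in seconds, assuming the listed TPS holds."""
    return output_tokens / PRICING[model]["tps"]


if __name__ == "__main__":
    # 2M input tokens and 500K output tokens on Qwen3 32B.
    print(round(estimate_cost("Qwen3 32B", 2_000_000, 500_000), 4))
    print(round(estimate_cost("Qwen3 32B", 2_000_000, 500_000, batch=True), 4))
    print(round(estimate_generation_seconds("GPT OSS 20B", 2_000), 2))
```

Because pricing is linear, cost scales proportionally with token counts, so the same function works for a single request or a month of traffic.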
Free Tier
Free tier available with rate limits. No idle infrastructure costs.
Models (7)
| Model | Context | In /1M | Out /1M | Capabilities |
|---|---|---|---|---|
| Llama 4 Scout (Groq) | 128K | $0.11 | $0.34 | Chat, Functions, Streaming |
| Qwen3 32B (Groq) | 131K | $0.29 | $0.59 | Chat, Functions, Streaming |
| Llama 3.3 70B Versatile (Groq) | 128K | $0.59 | $0.79 | Chat, Functions, Streaming |
| Llama 3.1 8B Instant (Groq) | 128K | $0.05 | $0.08 | Chat, Functions, Streaming |
| GPT OSS 20B (Groq) | 128K | $0.075 | $0.30 | Chat, Functions, Streaming |
| GPT OSS 120B (Groq) | 128K | $0.15 | $0.60 | Chat, Functions, Streaming |
| Kimi K2 (Groq) | 128K | $1.00 | $3.00 | Chat, Functions, Vision, +1 |
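One way to use the table is to rank the models by estimated cost for a given token mix, optionally filtering by context window. The sketch below is only an illustration: `Model` and `rank_models` are hypothetical names, context sizes are read as thousands of tokens, and the GPT OSS 20B input price is taken as $0.075, the figure given in the pay-per-use list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    context_k: int    # context window in thousands of tokens (assumed meaning of "K")
    in_price: float   # USD per 1M input tokens
    out_price: float  # USD per 1M output tokens


# Rows transcribed from the table above.
MODELS = [
    Model("Llama 4 Scout", 128, 0.11, 0.34),
    Model("Qwen3 32B", 131, 0.29, 0.59),
    Model("Llama 3.3 70B Versatile", 128, 0.59, 0.79),
    Model("Llama 3.1 8B Instant", 128, 0.05, 0.08),
    Model("GPT OSS 20B", 128, 0.075, 0.30),
    Model("GPT OSS 120B", 128, 0.15, 0.60),
    Model("Kimi K2", 128, 1.00, 3.00),
]


def rank_models(input_tokens, output_tokens, min_context_k=0, models=MODELS):
    """Return (name, cost_usd) pairs sorted from cheapest to most expensive."""
    eligible = [m for m in models if m.context_k >= min_context_k]
    costs = [
        (m.name, (input_tokens * m.in_price + output_tokens * m.out_price) / 1_000_000)
        for m in eligible
    ]
    return sorted(costs, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Cheapest-first ranking for 1M input and 1M output tokens.
    for name, cost in rank_models(1_000_000, 1_000_000):
        print(f"{name}: ${cost:.3f}")
```

Cost is only one axis. Capabilities and throughput also differ between models, so a ranking like this is a starting point rather than a recommendation.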