Groq
📢 Provider Information
Provider Name: Groq
Official Website: https://groq.com
Developer Console: https://console.groq.com
Type: Free Service (Permanently free with usage limits)
📋 Product Overview
Groq is a company providing ultra-high-speed AI inference services. Built on its self-developed LPU (Language Processing Unit) chip technology, it offers some of the fastest AI inference speeds in the industry.
Core Features:
- ⚡ Industry's Fastest Inference Speed - 800+ tokens/s
- 🧠 LPU Chip Powered - Hardware optimized for language models
- 🎁 Ultra-High Free Quota - 14,400 requests/day
- 🔄 OpenAI API Compatible - Seamlessly switch existing code
- 💬 Real-time Response - Extremely low latency conversation experience
Rating: ⭐⭐⭐⭐⭐ (Speed King!)
📝 Registration and Account
Registration Requirements
| Requirement | Required | Notes |
|---|---|---|
| Account Registration | ✅ Required | Email or Google/GitHub account |
| Email Verification | ✅ Required | Need to verify email |
| Phone Verification | ❌ Not Required | Usually not needed |
| Credit Card Binding | ✅ Required | For identity verification, no charges |
Registration Steps
Register Account
Visit https://console.groq.com and click the “Sign Up” button, then choose a registration method:
- Use Google account (recommended, fast)
- Use GitHub account
- Use email registration
Verify Email
If you registered with email, check your inbox and click the verification link to complete verification, then return to the Groq Console.
Verify Identity (Credit Card Binding)
After logging in, the system will prompt you to verify your identity:
- Click the “Verify Account” button
- Enter credit card information (supports Visa, MasterCard, AmEx, etc.)
- ⚠️ Important Note: This is only for identity verification; no charges will occur
- After successful verification, you can use the free service
Get API Key
- Select “API Keys” in the left menu
- Click the “Create API Key” button
- Name your API key (e.g., “My First Key”)
- Click “Submit” to create
- ⚠️ Important: Copy and save your API key immediately; you won’t be able to view it again
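With the key saved, you can make a first request from Python. The sketch below is a minimal example assuming the official groq Python SDK (listed under Related Links, installed with `pip install groq`), that the key is stored in the GROQ_API_KEY environment variable, and that `llama-3.3-70b-versatile` is an available model id; adjust these to match your account and the current model list.

```python
import os
from groq import Groq

# Assumes GROQ_API_KEY was exported beforehand; never hardcode the key in source.
client = Groq(api_key=os.environ["GROQ_API_KEY"])

completion = client.chat.completions.create(
    model="llama-3.3-70b-versatile",  # assumed model id; check the console for current names
    messages=[{"role": "user", "content": "Explain what an LPU is in two sentences."}],
)
print(completion.choices[0].message.content)
```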
🎯 Provided Services
Groq provides two main services:
1. Playground Service
- Type: Web conversation interface
- Access URL: https://console.groq.com/playground
- Features: Real-time inference speed display, intuitive parameter adjustment
- Supports: All Groq models
2. API Service
- Type: RESTful API
- Features: Fully compatible with OpenAI API format (see the sketch after this list)
- Models: Llama 3.3/3.1, Mixtral, Gemma 2, DeepSeek R1, etc.
- Quota: 14,400 requests/day
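Because the API follows the OpenAI format, existing OpenAI client code can usually be redirected to Groq by changing only the API key and base URL. A minimal sketch, assuming the openai Python package and that https://api.groq.com/openai/v1 is the compatible endpoint (verify against the API documentation):

```python
import os
from openai import OpenAI

# Reuse the standard OpenAI client, pointed at Groq's OpenAI-compatible endpoint.
client = OpenAI(
    api_key=os.environ["GROQ_API_KEY"],
    base_url="https://api.groq.com/openai/v1",  # assumed base URL; confirm in the docs
)

response = client.chat.completions.create(
    model="llama-3.1-8b-instant",  # assumed model id
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.choices[0].message.content)
```

Only the two constructor arguments change; the rest of the calling code stays the same.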
📊 Quota Overview
Free Tier Quota
| Limit Type | Quota | Notes |
|---|---|---|
| Daily Requests | 14,400 requests/day | Shared across all models |
| Requests Per Minute | 30 requests/min | Shared across all models |
| Daily Tokens | 20,000 tokens/day | Input + output total |
| Tokens Per Minute | 6,000 tokens/min | Input + output total |
⚠️ Important Notes:
- Shared quota: All models share the same account quota
- Daily reset: Quota resets daily at UTC midnight
- Token calculation: Both input and output tokens count toward quota
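Because the per-minute limits above are easy to hit in bursty workloads, it helps to treat HTTP 429 responses as a signal to back off and retry. A rough sketch against the raw REST endpoint; the endpoint path and the retry-after header are assumptions based on the OpenAI-compatible format, so adjust as needed:

```python
import os
import time
import requests

API_URL = "https://api.groq.com/openai/v1/chat/completions"  # assumed endpoint path
HEADERS = {"Authorization": f"Bearer {os.environ['GROQ_API_KEY']}"}

def chat_with_backoff(payload: dict, max_retries: int = 5) -> dict:
    """POST a chat completion, backing off when the per-minute quota is hit (HTTP 429)."""
    delay = 2.0
    for _ in range(max_retries):
        resp = requests.post(API_URL, headers=HEADERS, json=payload, timeout=60)
        if resp.status_code != 429:
            resp.raise_for_status()
            return resp.json()
        # Honor a Retry-After header if the server sends one; otherwise back off exponentially.
        time.sleep(float(resp.headers.get("retry-after", delay)))
        delay *= 2
    raise RuntimeError("Still rate-limited after retries")

result = chat_with_backoff({
    "model": "llama-3.1-8b-instant",  # assumed model id
    "messages": [{"role": "user", "content": "Ping"}],
})
print(result["choices"][0]["message"]["content"])
```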
🤖 Supported Models
Llama Series (Meta)
| Model Name | Parameters | Context Length | Use Cases |
|---|---|---|---|
| Llama 3.3 70B | 70B | 128K | Meta’s latest model, powerful performance |
| Llama 3.1 70B | 70B | 128K | Complex tasks |
| Llama 3.1 8B | 8B | 128K | Lightweight and efficient |
Other Open-Source Models
| Model Name | Parameters | Context Length | Features |
|---|---|---|---|
| Mixtral 8x7B | 47B | 32K | Mistral’s mixture-of-experts model |
| Gemma 2 9B | 9B | 8K | Google open-source model |
| DeepSeek R1 Distill Llama 70B | 70B | 32K | Reasoning expert model |
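Model ids change as models are added and retired, so rather than hardcoding the names above, the currently available models can be listed programmatically. A small sketch, assuming the OpenAI-compatible models endpoint is exposed (as the compatibility claim suggests) and the same base URL as before:

```python
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["GROQ_API_KEY"],
    base_url="https://api.groq.com/openai/v1",  # assumed OpenAI-compatible base URL
)

# Print the ids of all models currently available to the account.
for model in client.models.list().data:
    print(model.id)
```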
🚀 Core Technical Advantages
LPU Chip Technology
Language Processing Unit:
- Groq’s self-developed specialized chip
- Optimized for sequential computation of language models
- Extremely low latency: over 10x lower than GPUs
- High throughput: Can achieve 800+ tokens/s generation speed
Speed Comparison
| Provider | Typical Speed | Groq Advantage |
|---|---|---|
| Groq | 800+ tokens/s | Baseline |
| OpenAI GPT-4 | 20-40 tokens/s | 20x Faster |
| Anthropic Claude | 30-50 tokens/s | 16x Faster |
| Other Cloud Services | 50-100 tokens/s | 8x Faster |
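The advertised speeds are easiest to appreciate with streaming, where tokens arrive as they are generated. The sketch below streams a response and reports a rough throughput figure; it counts streamed chunks rather than true tokens, so treat the number only as an approximation (model id and base URL are the same assumptions as in the earlier examples):

```python
import os
import time
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["GROQ_API_KEY"],
    base_url="https://api.groq.com/openai/v1",  # assumed OpenAI-compatible base URL
)

start = time.time()
chunks = 0
stream = client.chat.completions.create(
    model="llama-3.1-8b-instant",  # assumed model id
    messages=[{"role": "user", "content": "Write a 200-word story about a robot."}],
    stream=True,
)
for chunk in stream:
    # Each streamed chunk carries a small piece of the generated text.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
        chunks += 1

elapsed = time.time() - start
print(f"\n~{chunks / elapsed:.0f} chunks/s (rough proxy for tokens/s)")
```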
Real-time Application Scenarios
- Chatbots: Nearly zero-latency conversation experience
- Code Assistants: Real-time code completion and generation
- Content Creation: Rapid long-text generation
- Data Analysis: Real-time data interpretation
⚠️ Usage Notes
Credit Card Verification
- Although the service is free, credit card binding is required for identity verification
- This is to prevent abuse, no charges will occur
- No automatic charges after free quota is exhausted
Quota Management
- Pay attention to daily and per-minute limits to avoid exceeding quota
- View quota usage on the Usage page in Console
- Plan how the quota is divided across your different applications
API Key Security
- Don’t expose API keys in public code repositories
- Use environment variables or config files to manage keys (see the snippet below)
- Regularly rotate API keys
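A common pattern for the first two points is to read the key from the environment and fail fast if it is missing, so the key itself never appears in source control:

```python
import os

# Read the key from the environment instead of hardcoding it in the repository.
# (Optionally load a local .env file first with python-dotenv, if you use one.)
api_key = os.environ.get("GROQ_API_KEY")
if not api_key:
    raise RuntimeError("Set the GROQ_API_KEY environment variable before running.")
```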
Network Requirements
- Groq supports most regions globally
- Access from mainland China may require a stable network connection
🔍 Comparison with Other Services
| Feature | Groq | Google AI Studio | OpenRouter |
|---|---|---|---|
| Inference Speed | 🚀 800+ tokens/s | 50-100 tokens/s | Varies by provider |
| Daily Requests | 14,400 | 1,500 | 50-1,000 |
| Daily Tokens | 20K-1M | 15M (Flash) | Unlimited |
| Credit Card Required | ✅ (verification only) | ❌ | ❌ |
| OpenAI Compatible | ✅ Fully Compatible | ❌ Not Compatible | ✅ Compatible |
| Multimodal Support | ❌ | ✅ | Some models |
| Mainland China Access | 🚧 Stable Network Required | 🚧 VPN Required | ✅ Good |
💡 Selection Suggestions
Reasons to Choose Groq
✅ Highly Recommended:
- Need extremely fast response speed
- Building real-time conversation applications
- High-frequency calls (14,400 times/day)
- Need OpenAI API compatibility
❌ Not Suitable For:
- Need multimodal support (images, audio)
- Need ultra-long context (>128K)
- Cannot provide credit card verification
💰 Paid Plans (Optional)
If free quota isn’t enough, Groq offers flexible paid options:
| Plan | Price | Features |
|---|---|---|
| Free | $0 | 14,400 req/day |
| Pay-as-you-go | Pay by usage | Higher quotas, billed by tokens |
| Enterprise | Custom | Dedicated support, SLA guarantee |
Pricing Examples:
- Llama 3.3 70B: ~$0.59/M tokens
- Llama 3.1 8B: ~$0.05/M tokens
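Using the approximate rates above, a back-of-the-envelope cost estimate is straightforward. Note that actual billing may distinguish input and output tokens, so check the current pricing page before relying on these numbers:

```python
# Rough monthly cost estimate from the approximate per-million-token rates quoted above.
PRICE_PER_MILLION_USD = {"llama-3.3-70b": 0.59, "llama-3.1-8b": 0.05}

def monthly_cost(model: str, tokens_per_day: int, days: int = 30) -> float:
    return tokens_per_day * days / 1_000_000 * PRICE_PER_MILLION_USD[model]

# Example: 2M tokens/day on Llama 3.3 70B works out to about $35.40 per month.
print(f"${monthly_cost('llama-3.3-70b', 2_000_000):.2f}")
```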
🔗 Related Links
- Official Website: https://groq.com
- Developer Console: https://console.groq.com
- API Documentation: https://console.groq.com/docs
- Python SDK: GitHub - groq-python
- Node.js SDK: GitHub - groq-typescript
- Community Discussion: Discord - Groq
- Status Page: https://status.groq.com
- Blog: https://groq.com/blog
📅 Changelog
- December 2024: Support for DeepSeek R1 Distill series reasoning models
- November 2024: Released Llama 3.3 70B support
- October 2024: Increased free tier quota to 14,400 requests/day
- 2024: Continuously optimizing LPU performance, improving inference speed
📧 Support & Feedback
- Official Support: [email protected]
- Discord Community: https://discord.gg/groq
- Issue Reporting: https://console.groq.com/support
- Feature Requests: Contact via Discord or email