Groq

๐Ÿข Provider Information

Provider Name: Groq
Official Website: https://groq.com
Developer Console: https://console.groq.com
Type: Free Service (free tier with usage limits; paid plans optional)


Product Overview

Groq provides high-speed AI inference services built on its self-developed LPU (Language Processing Unit) chip technology, and positions itself as offering the industry’s fastest AI inference speed.

Core Features:

  • Industry’s Fastest Inference Speed - 800+ tokens/s
  • LPU Chip Powered - Hardware optimized for language models
  • Ultra-High Free Quota - 14,400 requests/day
  • OpenAI API Compatible - Switch existing code with minimal changes
  • Real-time Response - Extremely low latency conversation experience

Rating: ⭐⭐⭐⭐⭐ (Speed King!)


๐Ÿ” Registration and Account

Registration Requirements

Requirement Required Notes
Account Registration โœ… Required Email or Google/GitHub account
Email Verification โœ… Required Need to verify email
Phone Verification โŒ Not Required Usually not needed
Credit Card Binding โœ… Required For identity verification, no charges

Registration Steps

Register Account

Visit https://console.groq.com and click the “Sign Up” button. Choose a registration method:

  • Use Google account (recommended, fast)
  • Use GitHub account
  • Use email registration

Verify Email

If you registered by email, check your inbox and click the verification link to complete email verification, then return to the Groq Console.

Verify Identity (Credit Card Binding)

After logging in, the system will prompt you to verify your identity:

  1. Click the “Verify Account” button
  2. Enter credit card information (supports Visa, MasterCard, AmEx, etc.)
  3. ⚠️ Important Note: This is only for identity verification; no charges will occur
  4. After successful verification, you can use the free service

Get API Key

  1. Select “API Keys” in the left menu
  2. Click the “Create API Key” button
  3. Name your API key (e.g., “My First Key”)
  4. Click “Submit” to create
  5. โš ๏ธ Important: Immediately copy and save your API key, you won’t be able to view it again

Provided Services

Groq provides two main services:

1. Playground Service

  • Type: Web conversation interface
  • Access URL: https://console.groq.com/playground
  • Features: Real-time inference speed display, intuitive parameter adjustment
  • Supports: All Groq models

2. API Service

  • Type: RESTful API
  • Features: Fully compatible with OpenAI API format
  • Models: Llama 3.3/3.1, Mixtral, Gemma 2, DeepSeek R1, etc.
  • Quota: 14,400 requests/day

Quota Overview

Free Tier Quota

| Limit Type          | Quota               | Notes                    |
|---------------------|---------------------|--------------------------|
| Daily Requests      | 14,400 requests/day | Shared across all models |
| Requests Per Minute | 30 requests/min     | Shared across all models |
| Daily Tokens        | 20,000 tokens/day   | Input + output total     |
| Tokens Per Minute   | 6,000 tokens/min    | Input + output total     |

⚠️ Important Notes:

  • Shared quota: All models share the same account quota
  • Daily reset: Quota resets daily at UTC midnight
  • Token calculation: Both input and output tokens count toward quota
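
One way to stay under the per-minute limits is to throttle on the client side before sending each request. The sketch below is an illustration, not part of any Groq SDK: the `MinuteLimiter` class and its parameters are hypothetical names, and the defaults simply mirror the 30 requests/min and 6,000 tokens/min figures from the table above. Token counts passed in are the caller's own estimate of input plus output tokens.

```python
import time
from collections import deque


class MinuteLimiter:
    """Sliding 60-second window over request count and token count."""

    def __init__(self, max_requests=30, max_tokens=6000, clock=time.monotonic):
        self.max_requests = max_requests
        self.max_tokens = max_tokens
        self.clock = clock  # injectable so the limiter can be tested without waiting
        self.events = deque()  # (timestamp, tokens) for each admitted request

    def _prune(self, now):
        # Forget requests that have left the 60-second window.
        while self.events and now - self.events[0][0] >= 60:
            self.events.popleft()

    def try_acquire(self, tokens):
        """Return True and record the request if it fits both limits."""
        now = self.clock()
        self._prune(now)
        used_tokens = sum(t for _, t in self.events)
        if len(self.events) >= self.max_requests:
            return False
        if used_tokens + tokens > self.max_tokens:
            return False
        self.events.append((now, tokens))
        return True
```

A caller would check `limiter.try_acquire(estimated_tokens)` before each request and wait briefly when it returns `False`. The server remains the authority on quota, so rate-limit errors from the API still need handling.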

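Supported Models

Llama Series (Meta)

| Model Name    | Parameters | Context Length | Use Cases                                  |
|---------------|------------|----------------|--------------------------------------------|
| Llama 3.3 70B | 70B        | 128K           | Meta’s latest model, powerful performance  |
| Llama 3.1 70B | 70B        | 128K           | Complex tasks                              |
| Llama 3.1 8B  | 8B         | 128K           | Lightweight and efficient                  |

Other Open-Source Models

| Model Name                    | Parameters | Context Length | Features                         |
|-------------------------------|------------|----------------|----------------------------------|
| Mixtral 8x7B                  | 47B        | 32K            | Mistral mixture of experts model |
| Gemma 2 9B                    | 9B         | 8K             | Google open-source model         |
| DeepSeek R1 Distill Llama 70B | 70B        | 32K            | Reasoning expert model           |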

Core Technical Advantages

LPU Chip Technology

Language Processing Unit:

  • Groq’s self-developed specialized chip
  • Optimized for sequential computation of language models
  • Extremely low latency: claimed to be over 10x lower than GPU-based inference
  • High throughput: Can achieve 800+ tokens/s generation speed

Speed Comparison

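| Provider             | Typical Speed   | Groq Advantage |
|----------------------|-----------------|----------------|
| Groq                 | 800+ tokens/s   | Baseline       |
| OpenAI GPT-4         | 20-40 tokens/s  | 20x Faster     |
| Anthropic Claude     | 30-50 tokens/s  | 16x Faster     |
| Other Cloud Services | 50-100 tokens/s | 8x Faster      |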
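
The multipliers in the table correspond to dividing Groq's 800 tokens/s by the upper end of each provider's range, so they are the conservative end of the comparison. The short calculation below reproduces them and also shows what the difference means for a 1,000-token response; the function names are illustrative, and the speeds are the figures quoted above rather than fresh measurements.

```python
GROQ_TOKENS_PER_SECOND = 800  # figure quoted in the table above

# Upper end of each range quoted in the table, in tokens per second.
others = {"OpenAI GPT-4": 40, "Anthropic Claude": 50, "Other Cloud Services": 100}


def speedup(baseline, other):
    """How many times faster the baseline generates tokens."""
    return baseline / other


def seconds_for(tokens, tokens_per_second):
    """Time to generate a given number of tokens at a steady rate."""
    return tokens / tokens_per_second


for name, rate in others.items():
    print(
        f"{name}: {speedup(GROQ_TOKENS_PER_SECOND, rate):.0f}x, "
        f"1,000 tokens in {seconds_for(1000, rate):.1f}s "
        f"vs {seconds_for(1000, GROQ_TOKENS_PER_SECOND):.2f}s on Groq"
    )
```

Measured against the lower end of each range instead, the ratios would be larger (for example 800 / 20 = 40x for the GPT-4 row).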

Real-time Application Scenarios

  • Chatbots: Nearly zero-latency conversation experience
  • Code Assistants: Real-time code completion and generation
  • Content Creation: Rapid long-text generation
  • Data Analysis: Real-time data interpretation

โš ๏ธ Usage Notes

Credit Card Verification

  • Although the service is free, credit card binding is required for identity verification
  • This is to prevent abuse, no charges will occur
  • No automatic charges after free quota is exhausted

Quota Management

  • Watch both the daily and per-minute limits to avoid exceeding quota
  • View quota usage on the Usage page in the Console
  • Allocate quota sensibly across different applications
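
As a rough illustration of quota planning, the sketch below splits a daily request budget across applications by weight and computes how long remains until the UTC-midnight reset. Both helpers are hypothetical (they are not Groq APIs), the application names in the usage note are made up, and the weights are assumed to be positive integers.

```python
from datetime import datetime, timedelta, timezone


def split_daily_quota(total_requests, weights):
    """Split a daily request budget across apps in proportion to integer weights."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    shares = {app: (total_requests * w) // total_weight for app, w in weights.items()}
    # Integer division can leave a few requests unassigned; give them to the
    # highest-weight app so the shares add up to the total.
    leftover = total_requests - sum(shares.values())
    top_app = max(weights, key=weights.get)
    shares[top_app] += leftover
    return shares


def seconds_until_reset(now=None):
    """Seconds remaining until the next UTC midnight."""
    now = now or datetime.now(timezone.utc)
    next_midnight = (now + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return (next_midnight - now).total_seconds()
```

For example, `split_daily_quota(14400, {"chatbot": 3, "batch": 1})` gives the chatbot 10,800 requests and the batch job 3,600.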

API Key Security

  • Don’t expose API keys in public code repositories
  • Use environment variables or config files to manage keys
  • Regularly rotate API keys
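
A minimal sketch of the environment-variable approach follows. The variable name `GROQ_API_KEY` and both helper functions are assumptions for illustration; any name works as long as the key stays out of source control.

```python
import os


def load_api_key(env_var="GROQ_API_KEY"):
    """Return the API key from the environment, failing loudly if it is absent."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key


def mask(key, visible=4):
    """Mask a key for log output, keeping only the last few characters."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]
```

Logging `mask(load_api_key())` instead of the raw key helps confirm which key is in use without leaking it.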

Network Requirements

  • Groq is available in most regions globally
  • Access from mainland China may require a stable network connection

Comparison with Other Services

| Feature               | Groq                    | Google AI Studio  | OpenRouter         |
|-----------------------|-------------------------|-------------------|--------------------|
| Inference Speed       | 800+ tokens/s           | 50-100 tokens/s   | Varies by provider |
| Daily Requests        | 14,400                  | 1,500             | 50-1,000           |
| Daily Tokens          | 20K-1M                  | 15M (Flash)       | Unlimited          |
| Credit Card Required  | ✅ Verification         | ❌                | ❌                 |
| OpenAI Compatible     | ✅ Fully Compatible     | ❌ Not Compatible | ✅ Compatible      |
| Multimodal Support    | ❌                      | ✅                | Some models        |
| Mainland China Access | Stable Network Required | VPN Required      | ✅ Good            |

Selection Suggestions

Reasons to Choose Groq

✅ Highly Recommended:

  • Need extremely fast response speed
  • Building real-time conversation applications
  • High-frequency calls (up to 14,400 requests/day)
  • Need OpenAI API compatibility

❌ Not Suitable For:

  • Need multimodal support (images, audio)
  • Need ultra-long context (>128K)
  • Cannot provide credit card verification

Paid Plans (Optional)

If free quota isn’t enough, Groq offers flexible paid options:

| Plan          | Price        | Features                        |
|---------------|--------------|---------------------------------|
| Free          | $0           | 14,400 req/day                  |
| Pay-as-you-go | Pay by usage | Higher quotas, billed by tokens |
| Enterprise    | Custom       | Dedicated support, SLA guarantee |

Pricing Examples:

  • Llama 3.3 70B: ~$0.59/M tokens
  • Llama 3.1 8B: ~$0.05/M tokens
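
To get a feel for pay-as-you-go costs, the approximate prices above can be turned into a simple estimate. This sketch assumes one flat rate per million tokens, as quoted; actual billing may price input and output tokens differently, so treat the result as a rough figure. The function and dictionary names are illustrative.

```python
# Approximate prices in USD per million tokens, as quoted above.
PRICE_PER_MILLION_USD = {
    "Llama 3.3 70B": 0.59,
    "Llama 3.1 8B": 0.05,
}


def estimate_cost(model, tokens):
    """Estimate cost in USD for a total token count at a flat per-token rate."""
    return tokens / 1_000_000 * PRICE_PER_MILLION_USD[model]
```

For example, 2 million tokens on Llama 3.3 70B comes to roughly 2 × $0.59 = $1.18, while the same volume on Llama 3.1 8B is about $0.10.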

๐Ÿ“ Changelog

  • January 2025: Support for DeepSeek R1 Distill series reasoning models
  • December 2024: Released Llama 3.3 70B support
  • October 2024: Increased free tier quota to 14,400 requests/day
  • 2024: Continuously optimizing LPU performance, improving inference speed
