Baidu Qianfan API - Permanently Free Chinese AI API Service

📋 Service Information

Provider: Baidu
Service Type: API Service
API Endpoint: https://aip.baidubce.com/rpc/2.0/ai_custom/v1/wenxinworkshop/chat/
Free Type: Permanently Free (some models) + New User Gift
API Compatibility: Supports OpenAI SDK (adjust base_url and api_key)


🎯 Service Overview

The Baidu Qianfan LLM platform provides AI API services built around permanently free, high-performance models. It is compatible with the OpenAI SDK, giving developers zero-cost access to AI capabilities.

Core Advantages:

  • ๐ŸŽ Permanently Free Models - ERNIE-3.5-8K, ERNIE-Speed-8K permanently free
  • ๐Ÿ’ฐ Excellent Value - Free models outperform GPT-3.5 Turbo
  • ๐Ÿ”„ OpenAI Compatible - Supports OpenAI SDK, one-line migration
  • ๐Ÿ‡จ๐Ÿ‡ณ Chinese Optimized - Optimized for Chinese, top performance
  • ๐Ÿš€ Fast Domestic Access - Servers in China, quick response
  • ๐ŸŽ New User Gift - ERNIE-4.0-8K 1M tokens/month (first month)
  • ๐Ÿ”’ Compliant for Business - Domestic compliance, safe for commercial use

🚀 Quick Start

Prerequisites

Required:

  • ✅ Registered Baidu Cloud account
  • ✅ Completed real-name verification
  • ✅ Created application and obtained API Key

For detailed steps, see: Baidu Registration Guide

5-Minute Quick Example

Using OpenAI SDK (Recommended)

Python
from openai import OpenAI

# Configure Baidu Qianfan (using OpenAI SDK)
client = OpenAI(
    api_key="YOUR_BAIDU_API_KEY",
    base_url="https://aip.baidubce.com/rpc/2.0/ai_custom/v1/wenxinworkshop"
)

# Call permanently free ERNIE-3.5-8K
response = client.chat.completions.create(
    model="ernie-3.5-8k",
    messages=[
        {"role": "system", "content": "You are a helpful assistant"},
        {"role": "user", "content": "Introduce quantum computing"}
    ]
)

print(response.choices[0].message.content)

🤖 Supported Models

Permanently Free Models

| Model ID | Context | Features | QPS | Price |
|---|---|---|---|---|
| ernie-3.5-8k | 8K | 🏆 Exceeds GPT-3.5 | 50 | Permanently Free |
| ernie-speed-8k | 8K | ⚡ Ultra-fast | 50 | Permanently Free |

New User Gift

| Model ID | Context | Features | QPS | Free Quota |
|---|---|---|---|---|
| ernie-4.0-8k | 8K | 🆕 Latest flagship | 5 | 1M tokens/month (first month) |

Paid Models

| Model ID | Context | Features | Price (input / output) |
|---|---|---|---|
| ernie-4.0-turbo-8k | 8K | High-performance flagship | ¥12/M / ¥12/M |
| ernie-4.0-turbo-128k | 128K | Ultra-long context | ¥20/M / ¥20/M |
| ernie-lite-8k | 8K | Lightweight & fast | ¥0/M / ¥0/M (trial) |

โš ๏ธ Important Notes:

  • ERNIE-3.5-8K and ERNIE-Speed-8K are permanently free with unlimited tokens
  • QPS Limit: Maximum requests per second (can be increased with paid plans)
  • New users receive 1M tokens of ERNIE-4.0-8K in first month
  • For detailed pricing, see Official Pricing

🔢 Quotas and Limits

Permanently Free Model Limits

| Limit | ERNIE-3.5-8K | ERNIE-Speed-8K |
|---|---|---|
| Token Limit | Unlimited | Unlimited |
| QPS | 50 | 50 |
| Concurrent | 50 | 50 |
| Context | 8K tokens | 8K tokens |
| Duration | Permanent | Permanent |

Quota Note: The above quotas are reference values. Please refer to the Console for actual quotas.

New User Gift Limits

| Limit | ERNIE-4.0-8K |
|---|---|
| Free Quota | 1M tokens/month |
| Duration | First month after registration |
| QPS | 5 |

Quota Query

  • Real-time quota check: Console Quota Page
  • Usage statistics: Console → Qianfan Platform → Usage Statistics

💻 Code Examples

1. Basic Conversation (OpenAI SDK)

Python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_BAIDU_API_KEY",
    base_url="https://aip.baidubce.com/rpc/2.0/ai_custom/v1/wenxinworkshop"
)

# Use permanently free ERNIE-3.5
response = client.chat.completions.create(
    model="ernie-3.5-8k",
    messages=[
        {"role": "user", "content": "What is artificial intelligence?"}
    ]
)

print(response.choices[0].message.content)

2. Streaming Output

Python
# Streaming output (for real-time display)
stream = client.chat.completions.create(
    model="ernie-3.5-8k",
    messages=[
        {"role": "user", "content": "Write an article about AI"}
    ],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

3. Multi-turn Conversation

Python
messages = [
    {"role": "system", "content": "You are a Python programming assistant"}
]

# First round
messages.append({"role": "user", "content": "How to read CSV files?"})
response = client.chat.completions.create(
    model="ernie-3.5-8k",
    messages=messages
)
messages.append({
    "role": "assistant", 
    "content": response.choices[0].message.content
})

# Second round
messages.append({"role": "user", "content": "How to handle missing values?"})
response = client.chat.completions.create(
    model="ernie-3.5-8k",
    messages=messages
)
print(response.choices[0].message.content)

4. Using Official SDK

Python
# Install official SDK
# pip install qianfan

import qianfan

# Initialize with API Key and Secret Key
chat_comp = qianfan.ChatCompletion(
    api_key="YOUR_API_KEY",
    secret_key="YOUR_SECRET_KEY"
)

# Call ERNIE-3.5-8K
resp = chat_comp.do(
    model="ERNIE-3.5-8K",
    messages=[{
        "role": "user",
        "content": "Hello, please introduce yourself"
    }]
)

print(resp["result"])

5. cURL Example

Bash
# Get access_token
curl -X POST \
  'https://aip.baidubce.com/oauth/2.0/token?grant_type=client_credentials&client_id=YOUR_API_KEY&client_secret=YOUR_SECRET_KEY'

# Call API (using obtained access_token)
curl -X POST \
  'https://aip.baidubce.com/rpc/2.0/ai_custom/v1/wenxinworkshop/chat/ernie-3.5-8k?access_token=YOUR_ACCESS_TOKEN' \
  -H 'Content-Type: application/json' \
  -d '{
    "messages": [
      {
        "role": "user",
        "content": "Hello, please introduce yourself"
      }
    ]
  }'
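
The same two-step flow (exchange the API Key and Secret Key for an access_token, then call the chat endpoint with that token as a query parameter) can be reproduced in Python. Below is a minimal sketch using the requests library, assuming the endpoints shown in the cURL example above; variable names are illustrative.

Python
import requests

API_KEY = "YOUR_API_KEY"
SECRET_KEY = "YOUR_SECRET_KEY"

# Step 1: exchange the API Key / Secret Key for an access_token
token_resp = requests.post(
    "https://aip.baidubce.com/oauth/2.0/token",
    params={
        "grant_type": "client_credentials",
        "client_id": API_KEY,
        "client_secret": SECRET_KEY,
    },
)
access_token = token_resp.json()["access_token"]

# Step 2: call the chat endpoint, passing the access_token as a query parameter
chat_resp = requests.post(
    "https://aip.baidubce.com/rpc/2.0/ai_custom/v1/wenxinworkshop/chat/ernie-3.5-8k",
    params={"access_token": access_token},
    json={"messages": [{"role": "user", "content": "Hello, please introduce yourself"}]},
)
print(chat_resp.json().get("result"))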

💡 Best Practices

✅ Recommended Approaches

  1. Choose the Right Model

    # General tasks, cost priority โ†’ ERNIE-3.5-8K (permanently free)
    model = "ernie-3.5-8k"
    
    # Speed priority โ†’ ERNIE-Speed-8K (permanently free)
    model = "ernie-speed-8k"
    
    # Quality priority โ†’ ERNIE-4.0-8K (free for new users)
    model = "ernie-4.0-8k"
  2. Error Handling and Retry

    import time
    from openai import OpenAI, APIError
    
    def call_with_retry(messages, max_retries=3):
        for i in range(max_retries):
            try:
                return client.chat.completions.create(
                    model="ernie-3.5-8k",
                    messages=messages
                )
            except APIError as e:
                if i < max_retries - 1:
                    wait_time = 2 ** i
                    print(f"API error, waiting {wait_time} seconds...")
                    time.sleep(wait_time)
                else:
                    raise
  3. Secure Key Management (a sample .env file follows this list)

    import os
    from dotenv import load_dotenv
    
    load_dotenv()
    api_key = os.getenv('BAIDU_API_KEY')
    secret_key = os.getenv('BAIDU_SECRET_KEY')
    
    client = OpenAI(
        api_key=api_key,
        base_url="https://aip.baidubce.com/rpc/2.0/ai_custom/v1/wenxinworkshop"
    )
  4. Monitor Usage

    • Regularly visit Console
    • Check remaining quota and usage statistics
    • Set quota alerts
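
For the key-management approach in item 3 above, the matching .env file sits next to your code and must stay out of version control. A minimal example with placeholder values:

Bash
# .env — never commit this file; add it to .gitignore
BAIDU_API_KEY=your_api_key_here
BAIDU_SECRET_KEY=your_secret_key_here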

🎯 Optimization Tips

Leverage Permanently Free Models:

  • ERNIE-3.5-8K already exceeds GPT-3.5 Turbo
  • Suitable for most use cases
  • Zero-cost development and operations

Optimize QPS Usage:

  • Allocate concurrent requests sensibly (see the client-side throttle sketch below)
  • Requests beyond the QPS limit are queued
  • Consider paid upgrade for higher QPS
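
If you prefer to stay below the QPS ceiling on the client side instead of letting excess requests queue, a naive throttle can space out calls. This is an illustrative sketch, not an SDK feature; it reuses the `client` object from the earlier examples, and the class name is hypothetical.

Python
import time
import threading

class SimpleRateLimiter:
    """Spaces calls so that at most `max_per_second` requests are sent per second."""

    def __init__(self, max_per_second=50):
        self.min_interval = 1.0 / max_per_second
        self.lock = threading.Lock()
        self.last_call = 0.0

    def wait(self):
        # Block just long enough to respect the configured rate
        with self.lock:
            now = time.monotonic()
            sleep_for = self.min_interval - (now - self.last_call)
            if sleep_for > 0:
                time.sleep(sleep_for)
            self.last_call = time.monotonic()

limiter = SimpleRateLimiter(max_per_second=50)

def throttled_completion(messages):
    limiter.wait()
    return client.chat.completions.create(model="ernie-3.5-8k", messages=messages)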

Result Caching:

import hashlib
import json

cache = {}

def cached_completion(model, messages):
    key = hashlib.md5(
        json.dumps({"model": model, "messages": messages}).encode()
    ).hexdigest()
    
    if key in cache:
        return cache[key]
    
    response = client.chat.completions.create(
        model=model,
        messages=messages
    )
    
    cache[key] = response
    return response
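
A quick usage check, reusing the `client` from the earlier examples. As a design note, passing sort_keys=True to json.dumps would make the cache key independent of dictionary ordering.

Python
# The second identical call is served from the in-memory cache, not the API
reply = cached_completion("ernie-3.5-8k", [{"role": "user", "content": "What is AI?"}])
print(reply.choices[0].message.content)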

โš ๏ธ Notes

  1. Real-name Verification Required: You must complete real-name verification before calling the API
  2. QPS Limits: Free models are limited to 50 QPS; excess requests are queued
  3. Content Moderation: Content must comply with Chinese law; sensitive content is rejected
  4. Token Calculation: Chinese text is counted by characters, roughly 1.5-2 characters per token (see the rough estimator below)
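
A rough, illustrative token estimator based on the rule of thumb in note 4; actual billing and usage always come from the platform's tokenizer and the API response, so treat this only as a planning aid.

Python
def estimate_tokens(text, chars_per_token=1.75):
    """Very rough estimate: ~1.5-2 Chinese characters per token."""
    return max(1, round(len(text) / chars_per_token))

print(estimate_tokens("人工智能正在改变世界"))  # about 6 tokens for 10 characters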

🔧 FAQ

1. How to get API Key?

Steps:

  1. Login to Baidu Cloud Qianfan Platform
  2. Complete real-name verification
  3. Create application
  4. Get API Key and Secret Key

2. Are permanently free models really unlimited?

Answer: Yes, tokens are unlimited. ERNIE-3.5-8K and ERNIE-Speed-8K are permanently free with no token quota, but they are rate-limited to 50 requests per second (QPS). This is sufficient for small to medium-scale applications.

3. How to use OpenAI SDK?

Answer: Baidu Qianfan supports the OpenAI SDK; just change the api_key and base_url:

client = OpenAI(
    api_key="YOUR_BAIDU_API_KEY",
    base_url="https://aip.baidubce.com/rpc/2.0/ai_custom/v1/wenxinworkshop"
)

4. How does ERNIE-3.5 compare to GPT-3.5?

Answer: According to official data and testing:

  • Chinese: ERNIE-3.5 > GPT-3.5 Turbo
  • English: ERNIE-3.5 ≈ GPT-3.5 Turbo
  • Price: ERNIE-3.5 permanently free, GPT-3.5 paid

5. How to get new user gift?

Answer:

  • Register Baidu Cloud account
  • Complete real-name verification
  • Automatically receive 1M tokens of ERNIE-4.0-8K in first month
  • Valid for first month after registration

6. What payment methods are supported?

Answer:

  • Alipay
  • WeChat Pay
  • Bank transfer
  • Bank card

📊 Performance Comparison

Comparison with Other APIs

| API | Free Type | Quota | Chinese | OpenAI Compatible |
|---|---|---|---|---|
| Baidu Qianfan | 🏆 Permanently Free | Unlimited | 🏆 Top | ✅ |
| Google AI Studio | Permanently Free | Free | Good | ❌ |
| DeepSeek | Trial Credits | ¥5 (7 days) | 🏆 Top | ✅ |
| Groq | Free Service | ~14,400 req/day | Good | ✅ |

Price Advantage

| Scenario | Baidu Qianfan | GPT-3.5 Turbo | Savings |
|---|---|---|---|
| 1M tokens | Free | ~$1.5 | 100% |
| 10M tokens | Free | ~$15 | 100% |
| 100M tokens | Free | ~$150 | 100% |

🌟 Practical Cases

Case 1: AI Customer Service

Python
def ai_customer_service(user_question):
    """AI customer service assistant"""
    response = client.chat.completions.create(
        model="ernie-3.5-8k",  # Use permanently free model
        messages=[
            {
                "role": "system",
                "content": "You are a professional customer service assistant"
            },
            {
                "role": "user",
                "content": user_question
            }
        ]
    )
    return response.choices[0].message.content

# Usage
answer = ai_customer_service("What are the features of your product?")
print(answer)

Case 2: Document Summarization

Python
def summarize_document(document_text):
    """Generate document summary"""
    response = client.chat.completions.create(
        model="ernie-3.5-8k",
        messages=[
            {
                "role": "system",
                "content": "You are a document analysis expert"
            },
            {
                "role": "user",
                "content": f"Please summarize the core content:\n\n{document_text}"
            }
        ]
    )
    return response.choices[0].message.content
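
Usage, mirroring Case 1 (the document text is a placeholder):

Python
summary = summarize_document("...paste the document text here...")
print(summary)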

Case 3: Content Moderation

Python
def content_moderation(user_content):
    """Content moderation"""
    response = client.chat.completions.create(
        model="ernie-speed-8k",  # Use ultra-fast model
        messages=[
            {
                "role": "system",
                "content": "You are a content moderation expert"
            },
            {
                "role": "user",
                "content": f"Judge if this content is compliant:\n{user_content}"
            }
        ]
    )
    return response.choices[0].message.content
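
Usage (the input string is a placeholder):

Python
verdict = content_moderation("Sample user comment to check")
print(verdict)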

📚 Related Resources

Official Documentation

Tools and Resources


๐Ÿ“ Update Log

  • 2026-01-28: Document verification and optimization, clarified quota limits
  • 2025: ERNIE-3.5-8K and ERNIE-Speed-8K permanently free
  • March 2025: Released ERNIE 4.5 and ERNIE X1 new models
  • Continuous Updates: Ongoing optimization of API performance and stability

Service Provider: Baidu
